TypeError: can't use a string pattern on a bytes-like object in Python
Hello everyone, I am new to Python and still learning it.
I have a small piece of code that should collect all the links found on the page at a given URL. Like this:
import re
import urllib.request

url = 'http://google.com'
linkregex = re.compile('<a\s*href=[\'|"](.*?)[\'"].*?>')
m = urllib.request.urlopen(url)
msg = m.read()
links = linkregex.findall(msg)
print(links)
But I get a TypeError: can't use a string pattern on a bytes-like object when I run the code above.
I am using Python 3.8.2.
Can anyone explain this to me? How can I solve it?
Thanks for any response.
-
Gerardo Valle Sep 16 2021
You have used a string pattern on a bytes object: urlopen(url).read() returns bytes, not str. Use a bytes pattern instead.
Replace:
linkregex = re.compile('<a\s*href=[\'|"](.*?)[\'"].*?>')
with:
linkregex = re.compile(rb'<a\s*href=[\'|"](.*?)[\'"].*?>')
The rb prefix makes the pattern a raw bytes literal, so \s reaches the regex engine unchanged. Note that the matched links will then be bytes objects too.
I hope this is useful for you.
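To see the mismatch without touching the network, here is a minimal sketch. The `sample` bytes are a made-up stand-in for whatever `m.read()` would return, and the pattern drops the stray `|` from the character class; both are assumptions for illustration, not part of the original code.

```python
import re

# Hypothetical stand-in for the bytes returned by urlopen(url).read()
sample = b'<html><a href="http://example.com/a">A</a> <a href=\'/b\'>B</a></html>'

str_pattern = re.compile(r'<a\s*href=[\'"](.*?)[\'"].*?>')     # str pattern
bytes_pattern = re.compile(rb'<a\s*href=[\'"](.*?)[\'"].*?>')  # bytes pattern

# A str pattern applied to bytes raises TypeError
try:
    str_pattern.findall(sample)
    mismatch_error = None
except TypeError as exc:
    mismatch_error = str(exc)

# A bytes pattern applied to bytes works and returns bytes objects
links = bytes_pattern.findall(sample)

print(mismatch_error)
print(links)
```

The pattern and the data must be the same type: both str or both bytes. If you need text afterwards, each result can be decoded, for example `links[0].decode('utf-8')`, assuming the page is UTF-8.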
-
đặng thái sơn Sep 16 2021
The URL you used for Google didn't work for me, so I replaced it with
http://www.google.com/ig?hl=en
which works for me. Try it:
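import re
import urllib.request

url = "http://www.google.com/ig?hl=en"
linkregex = re.compile(r'<a\s*href=[\'|"](.*?)[\'"].*?>')
m = urllib.request.urlopen(url)
msg = m.read()
# Decode the bytes to text before matching (UTF-8 is assumed here);
# str(msg) would only produce the b'...' repr of the bytes
links = linkregex.findall(msg.decode('utf-8', errors='replace'))
print(links)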
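Here is a small sketch of why decoding may be preferable to calling str() on the response body. The `raw` bytes are a hypothetical stand-in for `m.read()`, and UTF-8 is an assumed encoding; a real page may declare a different one.

```python
import re

# Hypothetical stand-in for the bytes returned by urlopen(url).read()
raw = b'<a href="/caf\xc3\xa9">menu</a>\n<a href="/about">about</a>'

linkregex = re.compile(r'<a\s*href=[\'"](.*?)[\'"].*?>')

# str() on bytes returns the repr, so non-ASCII bytes show up as escape text
via_str = linkregex.findall(str(raw))

# decode() turns the bytes into real text; UTF-8 is an assumption here
via_decode = linkregex.findall(raw.decode('utf-8', errors='replace'))

print(via_str)
print(via_decode)
```

With str(), the first link comes back containing literal backslash escapes instead of the accented character, so decoding is usually the safer route. For a real response, `m.headers.get_content_charset()` may report the page's declared encoding, though it can return None.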