码迷,mamicode.com
首页 > 编程语言 > 详细

Core Python | 2 - Core Python: Getting Started | 2.4 - Introducing Strings, Collections, and Iteration | 2.4.4 - Bytes

时间:2021-03-06 14:52:54      阅读:0      评论:0      收藏:0      [点我收藏+]

标签:contains   ble   nbsp   you   ams   repr   app   png   opera   

Bytes are very similar to strings, except that rather than being sequences of Unicode code points, they are sequences of, well, bytes. As such, they are used for raw binary data and fixed?width single?byte character encodings such as ASCII. As with strings, they have a simple, literal form using quotes, the first of which is prefixed by a lower case b. There is also a bytes constructor, but it‘s an advanced feature and we won‘t cover it in this fundamentals course. At this point, it‘s sufficient for us to recognize bytes literals, and understand that they support most of the same operations as string, such as indexing, which returns the integer value of the specified byte, and splitting, which you‘ll see returns a list of bytes objects. To convert between bytes and strings, we must know the encoding of the byte sequence used to represent the string‘s Unicode code points as bytes. Python supports a wide variety of encodings, a full list of which can be found at python.org. Let‘s start with an interesting Unicode string which contains all the characters of the 29?letter Norwegian alphabet, a pangram. We‘ll now encode that using UTF?8 into a bytes object. See how the Norwegian characters have each been rendered as pairs of bytes. We can reverse that process using the decode method of the bytes object. Again, we must supply the correct encoding. We can check that the result is equal to what we started with, and display it for good measure. This may seem like an unnecessary detail so early in the course, especially if you operate in an anglophone environment, but it‘s a crucial point to understand since files and network resources such as HTTP responses are transmitted as byte streams, whereas we often prefer to work with the convenience of Unicode strings.

 

字节(bytes)与字符串(str)

字符串就是你说的话,human-reable,比如:I want my mother to call me eight times a day

当你要把这句话存储到计算机硬盘上,你要将它转换为计算机能懂的话,就是机器语言,"01010101...",它由若干个比特(一个比特就是一个0或者一个1)组成,8个比特组成1个字节。

将huaman-readable转换成机器语言的过程叫做encode编码,反之叫做decode解码。文件和网络资源(如HTTP响应)是以字节流的形式传输的,所以这时候,也需要进行encode和decode转换

在Python3中,当涉及到encode和decode的时候,它会以默认的UTF-8的编码方法自动帮你encode和decode

技术图片

 

 

 Python3可以直接对字节进行操作,方法类似于字符串,例如索引、切片。不过,为什么不把字节先decode成字符串,操作完之后,再encode回去呢?这样更直观

 

Bytes are very similar to strings, except that rather than being sequences of Unicode code points, they are sequences of, well, bytes. As such, they are used for raw binary data and fixed?width single?byte character encodings such as ASCII. As with strings, they have a simple, literal form using quotes, the first of which is prefixed by a lower case b. There is also a bytes constructor, but it‘s an advanced feature and we won‘t cover it in this fundamentals course. At this point, it‘s sufficient for us to recognize bytes literals, and understand that they support most of the same operations as string, such as indexing, which returns the integer value of the specified byte, and splitting, which you‘ll see returns a list of bytes objects. To convert between bytes and strings, we must know the encoding of the byte sequence used to represent the string‘s Unicode code points as bytes. Python supports a wide variety of encodings, a full list of which can be found at python.org. Let‘s start with an interesting Unicode string which contains all the characters of the 29?letter Norwegian alphabet, a pangram. We‘ll now encode that using UTF?8 into a bytes object. See how the Norwegian characters have each been rendered as pairs of bytes. We can reverse that process using the decode method of the bytes object. Again, we must supply the correct encoding. We can check that the result is equal to what we started with, and display it for good measure. This may seem like an unnecessary detail so early in the course, especially if you operate in an anglophone environment, but it‘s a crucial point to understand since files and network resources such as HTTP responses are transmitted as byte streams, whereas we often prefer to work with the convenience of Unicode strings.

Core Python | 2 - Core Python: Getting Started | 2.4 - Introducing Strings, Collections, and Iteration | 2.4.4 - Bytes

标签:contains   ble   nbsp   you   ams   repr   app   png   opera   

原文地址:https://www.cnblogs.com/hmlhml/p/14489434.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!