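In Python, strings are the basic data type for working with text. They are immutable sequences of Unicode code points. Creating a string is simple: enclose the text in single or double quotes, which Python treats identically. This article covers some important and useful Python string functions for data analysis and data manipulation; they are especially valuable in natural language processing (NLP).
The functions covered are capitalize(), lower(), title(), casefold(), upper(), count(), find(), replace(), swapcase(), and join().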
capitalize() returns a copy of the string with its first character in uppercase and all remaining characters in lowercase.
string = "analytics Vidhya is the Largest data science Community"
print(string.capitalize())
What happens if the first character is a digit rather than a letter?
string = '10 version of Data Science Blogathon by Analytics Vidhya is very good'
print(string.capitalize())
The output will be: 10 version of data science blogathon by analytics vidhya is very good
lower() returns a copy of the string with all characters in lowercase. Symbols and digits are left unchanged.
string = "Analytics Vidhya is the Largest Data Science Community"
print(string.lower())
What happens if the string contains digits?
string = '10 version of Data Science Blogathon by Analytics Vidhya is very good'
print(string.lower())
The output will be: 10 version of data science blogathon by analytics vidhya is very good
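Since the article points to NLP as a main use case, here is a minimal sketch of why lowercasing matters when counting words. The sample sentence and the variable names are made up for illustration, and splitting on whitespace is a simplification, not a real tokenizer.

```python
from collections import Counter

# Hypothetical sample text; any sentence with mixed casing would do.
text = "Data science needs data and more DATA"

# Without normalization, "Data", "data" and "DATA" are three different tokens.
raw_counts = Counter(text.split())
print(raw_counts["data"])         # 1

# Lowercasing first merges them into a single token.
normalized_counts = Counter(text.lower().split())
print(normalized_counts["data"])  # 3
```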
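title() returns a copy of the string in which the first character of every word is uppercase and the remaining characters are lowercase, like a heading or headline.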
string = "analytics vidhya is the Largest data science Community"
print(string.title())
If the string contains digits or symbols, title() capitalizes the first letter that follows them, so "10th" becomes "10Th".
string = '10th version of Data Science Blogathon by Analytics Vidhya is very good'
print(string.title())
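The behaviour around digits and symbols can be surprising with apostrophes too. The sketch below compares title() with capwords() from the standard-library string module, which splits on whitespace only. The sample sentence is made up, and capwords() is imported by name so that it does not clash with the variable called string used in this article.

```python
from string import capwords  # standard-library helper, split on whitespace only

# Hypothetical sample text containing an apostrophe and a digit prefix.
text = "they're attending the 10th data science blogathon"

# title() treats every non-letter (apostrophes, digits) as a word boundary.
titled = text.title()
print(titled)      # They'Re Attending The 10Th Data Science Blogathon

# capwords() capitalizes each whitespace-separated piece instead.
cap_worded = capwords(text)
print(cap_worded)  # They're Attending The 10th Data Science Blogathon
```

Which one is preferable depends on the data; capwords() also collapses runs of whitespace into single spaces, which may or may not be wanted.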
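casefold() returns a copy of the string with all characters converted to lowercase. It is similar to lower() but more aggressive: it applies Unicode case folding, which maps more characters to a common form (for example, the German "ß" becomes "ss"), so caseless comparisons of two strings find more matches.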
string = "Analytics Vidhya is the Largest Data Science Community"
print(string.casefold())
What happens if the string contains digits?
string = '10th version of Data Science Blogathon by Analytics Vidhya is very good'
print(string.casefold())
The output will be: 10th version of data science blogathon by analytics vidhya is very good
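For plain English text, casefold() and lower() give the same result, so the difference only shows up with certain non-ASCII characters. A small sketch using the German "ß"; the helper name same_ignoring_case is hypothetical and exists only for this example.

```python
german = "Straße"

print(german.lower())     # straße  (ß is already lowercase, so it is kept)
print(german.casefold())  # strasse (ß is folded to "ss")


def same_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings caselessly using casefold() (hypothetical helper)."""
    return a.casefold() == b.casefold()


# lower() misses this match, casefold() finds it.
lower_match = "STRASSE".lower() == german.lower()
casefold_match = same_ignoring_case("STRASSE", german)
print(lower_match, casefold_match)  # False True
```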
upper() returns a copy of the string with all characters in uppercase. Symbols and digits are left unchanged.
string = "analytics Vidhya is the Largest Data Science Community"
print(string.upper())
What happens if the string contains digits?
string = '10th version of Data Science Blogathon by Analytics Vidhya is very good'
print(string.upper())
The output will be: 10TH VERSION OF DATA SCIENCE BLOGATHON BY ANALYTICS VIDHYA IS VERY GOOD
count() returns the number of times a specified value occurs in the string. The search is case-sensitive.
string = "analytics Vidhya is the Largest Analytics Community"
print(string.count("analytics"))
Count the occurrences of the specified value between positions 10 and 18 (the end position is exclusive).
string = "analytics Vidhya is the Largest analytics Community"
print(string.count("analytics", 10, 18))
The output will be: 0
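Two details of count() are easy to overlook: it is case-sensitive, and it counts non-overlapping occurrences only. The sketch below illustrates both; lowercasing before counting is one possible way to ignore case, not the only one.

```python
text = "analytics Vidhya is the Largest Analytics Community"

# Case-sensitive: "Analytics" with a capital A is not counted.
exact = text.count("analytics")
print(exact)        # 1

# Normalizing the case first counts both spellings.
ignore_case = text.lower().count("analytics")
print(ignore_case)  # 2

# Matches do not overlap: "aaaa" contains "aa" at 0 and 2, not at 1.
overlap = "aaaa".count("aa")
print(overlap)      # 2
```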
find() returns the index of the first occurrence of a specified value. If the value is not found in the string, it returns -1.
string = "analytics vidhya is the Largest data science Community"
print(string.find("d"))
What happens if we search only between positions 5 and 16?
string = "analytics vidhya is the Largest data science Community"
print(string.find("d", 5, 16))
If the value is not found, find() returns -1, whereas index() raises a ValueError.
string = "analytics vidhya is the Largest data science Community"
print(string.find("d", 5, 10))
The output will be: -1
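The difference between find() and index() can be seen side by side in the short sketch below. The variable names are illustrative; the last lines also show why the return value of find() should not be used directly as a truth value.

```python
text = "analytics vidhya is the Largest data science Community"

# find() signals "not found" with -1, so no exception handling is needed.
missing_position = text.find("z")
print(missing_position)  # -1

# index() raises ValueError for the same search.
try:
    text.index("z")
    index_raised = False
except ValueError:
    index_raised = True
print(index_raised)      # True

# A match at position 0 is falsy and -1 is truthy, so use `in` for membership.
starts_at_zero = text.find("analytics")
is_present = "analytics" in text
print(starts_at_zero, is_present)  # 0 True
```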
replace() returns a copy of the string in which a specified phrase is replaced with another specified phrase.
string = "analytics vidhya is the Largest data science Community"
print(string.replace("science", "scientists"))
Replace only the first occurrence of the word.
string = "Data science Courses by analytics vidhya are the best courses to learn Data science"
print(string.replace("science", "scientists", 1))
The output will be: Data scientists Courses by analytics vidhya are the best courses to learn Data science
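Because strings are immutable, replace() never changes the original; it returns a new string that has to be assigned to a name. The sketch below shows this and a simple way to chain several replacements. The variable names are illustrative.

```python
original = "Data science Courses by analytics vidhya are the best courses to learn Data science"

# The third argument limits how many occurrences are replaced.
updated = original.replace("science", "scientists", 1)
print(updated)

# The original string is untouched.
print(original.count("science"))  # 2

# Each call returns a new string, so calls can be chained.
chained = original.replace("Data", "data").replace("Courses", "courses")
print(chained)
```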
swapcase() returns a copy of the string in which all uppercase letters become lowercase and vice versa.
string = "analytics vidhya is the Largest data science Community"
print(string.swapcase())
What happens if the string contains digits?
string = '10th version of Data Science Blogathon by Analytics Vidhya is very good'
print(string.swapcase())
The output will be: 10TH VERSION OF dATA sCIENCE bLOGATHON BY aNALYTICS vIDHYA IS VERY GOOD
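One caveat worth knowing: applying swapcase() twice does not always give back the original string. For ASCII text it does, but some characters change length when their case changes. A small sketch, with sample strings chosen only for illustration:

```python
ascii_text = "Data Science"
ascii_round_trip = ascii_text.swapcase().swapcase()
print(ascii_round_trip == ascii_text)  # True

# "ß" has no single-character uppercase form, so the round trip differs.
german = "Straße"
german_round_trip = german.swapcase().swapcase()
print(german_round_trip == german)     # False
```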
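join() concatenates all items of an iterable into a single string. The string it is called on is used as the separator, and every item must itself be a string.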
myTuple = ("Data Scientists", "Machine Learning", "Data Science")
x = "#".join(myTuple)
print(x)
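Join all the keys of a dictionary into a single string, using "TEST" as the separator. Iterating over a dictionary yields its keys, not its values.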
myDict = {"name": "Analytics Vidhya", "country": "India", "Technology": "Data Science"}
mySeparator = "TEST"
x = mySeparator.join(myDict)
print(x)
The output will be: nameTESTcountryTESTTechnology
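Joining a dictionary directly uses only its keys. The sketch below shows how the values or key/value pairs could be joined instead, and how non-string items can be converted first. The snake_case names and the list of numbers are illustrative additions, not part of the original example.

```python
my_dict = {"name": "Analytics Vidhya", "country": "India", "Technology": "Data Science"}

# Iterating over a dict yields its keys.
keys_joined = "TEST".join(my_dict)
print(keys_joined)     # nameTESTcountryTESTTechnology

# Use .values() to join the values instead.
values_joined = "TEST".join(my_dict.values())
print(values_joined)   # Analytics VidhyaTESTIndiaTESTData Science

# A generator expression can format each key/value pair before joining.
pairs_joined = ", ".join(f"{key}={value}" for key, value in my_dict.items())
print(pairs_joined)    # name=Analytics Vidhya, country=India, Technology=Data Science

# join() raises TypeError on non-string items, so convert them first.
numbers_joined = "-".join(str(number) for number in [2021, 7, 15])
print(numbers_joined)  # 2021-7-15
```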