In R, is there a way to use a regular expression or something similar to extract the first and last characters of an email string?

Time: 2022-09-13 14:17:43

Currently in R, with data.table, I have the following column:

[email protected]       
[email protected]        
[email protected]
j*****[email protected]

which are factors. I would like to parse the above so that I can get the first and last letters of the username before the @ symbol.

So for the above I'd like to get:

jn
be
cr
jn

I deal with some asterisked usernames, so I added one in too. Is there a simple way to do this? Any thoughts would be greatly appreciated.

1 solution

#1


Match the following pattern to the strings and replace it with the capture groups:

sub("(.).*(.)@.*", "\\1\\2", s)
## [1] "jn" "be" "cr" "jn"
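
To make the pattern easier to follow, here is a minimal sketch that breaks it down piece by piece. The addresses are hypothetical stand-ins (the real ones are obscured in the post), and the names `emails`, `initials`, and `short` are only for illustration. The input is built as a factor to mirror the column described in the question.

```r
# Hypothetical stand-in addresses (the real ones are obscured in the post),
# stored as a factor to mirror the column described in the question.
emails <- factor(c("john@example.com", "bobbie@example.com",
                   "carter@example.com", "j*****n@example.com"))

# (.)   captures the first character of the string
# .*    greedily consumes the middle of the username
# (.)@  captures the one character immediately before "@"
# .*    consumes the domain, so the whole string gets replaced
initials <- sub("(.).*(.)@.*", "\\1\\2", emails)

print(initials)  # a character vector: "jn" "be" "cr" "jn"

# A one-character username cannot satisfy both capture groups, so sub()
# finds no match and returns the input unchanged.
short <- sub("(.).*(.)@.*", "\\1\\2", "a@example.com")
print(short)     # "a@example.com"
```

Inside a data.table, the same call should work within `:=`, for instance `dt[, initials := sub("(.).*(.)@.*", "\\1\\2", email)]`, where `dt` and `email` are hypothetical names for the table and its column.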

Note

The input strings in reproducible form are:

s <- c("[email protected]", "[email protected]", "[email protected]",
  "j*****[email protected]")
