📜  How to Split Text Using Regular Expressions in Golang?

📅  Last modified: 2021-10-25 02:11:24             🧑  Author: Mango

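What is a regular expression? Is regex a famous anime character? If that was your guess, you are likely to be disappointed. The Go programming language uses the term regexp for regular expressions, which are central to string processing. Go's "regexp" package provides the pre-built functions needed to search with regular expressions, and it guarantees that a search runs in time linear in the size of the input. To learn more about regular expressions, read What is Regex in Golang?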

How to split input text using regexp (or regex)?

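The regexp package contains a Split function that splits an input string. Before going into the details of Split, here is a brief reminder of some basic regular expression syntax that is worth keeping in mind when using it: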

Character(s)  Meaning                                                     Example           Matches
[ ]           Character class: matches one character from the listed     "[b-f]an"         ban, can, dan, ean, fan
              set or range; both ends of a range are inclusive.
{ }           Sets how many times the preceding expression may repeat.   "gf{1,}g"         gfg, gffg, gfffg, ...
( )           Numbered capturing group. Its contents are matched         "(b-f)an"         b-fan (group 1 holds "b-f")
              literally, so "-" does not form a range here.
*             Zero or more occurrences of the character before it.       "gee*k"           gek, geek, geeek, ...
+             One or more occurrences of the character before it.        "gee+k"           geek, geeek, ...
?             Zero or one occurrence of the character before it.         "gee?k"           gek, geek
.             Any single character except a newline (\n).                "g.g"             gfg, gbg, gcg, ...
^             Anchors the match at the start of the text.                "^ge"             ge, geek, geeks, ...
              Inside [ ], negates the character class (NOT).             "[^0-8]*"         "", 9, 99, 999, ...
$             Anchors the match at the end of the text, or at the        "de$"             code, decode, ...
              end of each line when the multi-line flag (?m) is set.
|             The OR operator between two expressions.                   "geek|principle"  geek, principle (also found inside
                                                                                            geeks, principles)
\             Escape character. In an interpreted Go string literal      "\\A", "\\n",
              the backslash is doubled, so \s is written "\\s".          "\\s", etc.
\s            Matches a single whitespace character.                     "\\s"             space, tab, newline
\S            Matches a single non-whitespace character.                 "\\S"             a, Z, 7, !
\d            Matches a single digit.                                    "\\d"             0, 1, 2, 3, 4, 5, 6, 7, 8, 9
\D            Matches a single non-digit character.                      "\\D"             a, Z, !, space

 

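Syntax:

func (re *Regexp) Split(s string, n int) []string

This method accepts a string and an integer and returns a slice of substrings. The input string 's' is cut into substrings at every match of the regular expression on which Split is called. 'n' determines how many substrings are returned.

  • If n > 0: at most n substrings are returned; the last substring is the unsplit remainder of the string.
  • If n = 1: no split is performed, so the result contains only the original string.
  • If n = 0: no substrings are returned; the result is nil.
  • If n < 0: all substrings are returned.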

Example 1:

Go
package main
  
import (
    f "fmt"
    re "regexp"
    // we import the regexp package
    // as re
)
  
// Split cuts the input string at every
// match of the regular expression and
// returns the pieces between the matches.
  
func main() {
  
    // str stores a sample string as shown below
    str := "I am at GFG!\nYou can call me at 9087651234."
      
    // just printing the original string: str
    f.Println(str)
  
    // We shall consider two scenarios
    // in this example code
  
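    // Scenario 1:
    // obj1 matches a run of zero or more digits.
    // Because "\\d*" can also match the empty
    // string, Split also cuts between the
    // individual non-digit characters.
    obj1 := re.MustCompile("\\d*")
  
    // Scenario 2:
    // obj2 matches a run of zero or more
    // non-digit characters. It can match the
    // empty string too, so Split also cuts
    // between the individual digits.
    obj2 := re.MustCompile("\\D*")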
  
    // Split str at every match of obj1.
    // n = -1 means that all substrings
    // are returned.
    first := obj1.Split(str, -1)
      
    // "first" now holds the substrings of
    // str produced by obj1, as a slice
    // of strings.
  
    // Split str at every match of obj2,
    // again returning all substrings.
    second := obj2.Split(str, -1)
    // "second" now holds the substrings
    // of str produced by obj2, as a slice
    // of strings.
  
    f.Println("Now printing text split by obj1...")
    for _, p := range first {
        f.Println(p)
    }
  
    f.Println("Now printing text split by obj2...")
    for _, q := range second {
        f.Println(q)
    }
}



Command to execute:

> go run (your_file_name).go

Output:

I am at GFG!
You can call me at 9087651234.
Now printing text split by obj1...
I

a
m

a
t

G
F
G
!


Y
o
u

c
a
n

c
a
l
l

m
e

a
t

.
Now printing text split by obj2...

9
0
8
7
6
5
1
2
3
4

Visual I/O demonstration in Visual Studio Code:

[Screenshot: on-screen output when the code above (code 1) is run]

Example 2:

package main
  
import (
    f "fmt"
    re "regexp"
)
  
// Simple example code to understand
// 1. Function of Split function
// 2. Parameters of split function
// regex-object.Split(string: , n: )
  
func main() {
  
    // Sample string that will be used in this
    // example "GeeksforGeeks loves bananas"
    str := "GeeksforGeeks loves bananas"
    f.Println(str)
  
    f.Println("Part-1: Excluding all vowels from given string")
      
    // a regexp object (geek) that matches any single lowercase vowel
    geek := re.MustCompile("[aeiou]")
    f.Print("Printing all substring lists = ")
      
    // Checking split for n = -1
    f.Println(geek.Split(str, -1))
    f.Print("For n = 0 substring list = ")
      
    // Checking split for n = 0
    f.Println(geek.Split(str, 0))
    f.Print("For n = 1 substring list = ")
      
    // Checking split for n = 1
    f.Println(geek.Split(str, 1))
    f.Print("For n = 10 substring list = ")
      
    // Checking split for n = 10
    f.Println(geek.Split(str, 10))
    f.Print("For n = 100 substring list = ")
      
    // Checking split for n = 100
    f.Println(geek.Split(str, 100))
  
    f.Println("\n\nPart-2: Extracting all vowels from given string")
      
    // a regexp object (geek) that matches any character other than a lowercase vowel
    geek = re.MustCompile("[^aeiou]")
      
    f.Print("Printing all substring lists = ")
      
    // Checking split for n = -1
    f.Println(geek.Split(str, -1))
    f.Print("For n = 0 substring list = ")
      
    // Checking split for n = 0
    f.Println(geek.Split(str, 0))
    f.Print("For n = 1 substring list = ")
      
    // Checking split for n = 1
    f.Println(geek.Split(str, 1))
    f.Print("For n = 10 substring list = ")
      
    // Checking split for n = 10
    f.Println(geek.Split(str, 10))
    f.Print("For n = 100 substring list = ")
      
    // Checking split for n = 100
    f.Println(geek.Split(str, 100))
  
    // Did you notice that split function
    // does not modify the original regex
    // matching object?
}

Command to execute:

> go run (your_file_name).go

Output:

GeeksforGeeks loves bananas
Part-1: Excluding all vowels from given string
Printing all substring lists = [G  ksf rG  ks l v s b n n s]
For n = 0 substring list = []
For n = 1 substring list = [GeeksforGeeks loves bananas]
For n = 10 substring list = [G  ksf rG  ks l v s b n nas]
For n = 100 substring list = [G  ksf rG  ks l v s b n n s]


Part-2: Extracting all vowels from given string
Printing all substring lists = [ ee   o  ee    o e   a a a ]
For n = 0 substring list = []
For n = 1 substring list = [GeeksforGeeks loves bananas]
For n = 10 substring list = [ ee   o  ee   loves bananas]
For n = 100 substring list = [ ee   o  ee    o e   a a a ]

Visual I/O demonstration in Visual Studio Code:

[Screenshot: on-screen output when the code above (code 2) is run]