
Java Regular Expression Notation

Characters

      \t           The tab character                
      \n           The newline (line feed) character  
      \r           The carriage-return character            
      \f           The form-feed character          
      \a           The alert (bell) character       
      \e           The escape character               
   

Character Classes

      [abc]        a, b, or c (simple class)
      [^abc]       Any character except a, b, or c (negation)
      [a-zA-Z]     a through z or A through Z, inclusive (range)
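The three kinds of class above can be tried out with String.matches, which returns true only when the whole string matches the pattern. This is a minimal sketch; the class name CharClassDemo and the helper matchesAll are made up for illustration and are not part of any library.

```java
public class CharClassDemo {
    // Hypothetical helper: true when the entire input matches the regex.
    static boolean matchesAll(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(matchesAll("[abc]", "b"));     // simple class: true
        System.out.println(matchesAll("[^abc]", "b"));    // negation: false
        System.out.println(matchesAll("[^abc]", "x"));    // negation: true
        System.out.println(matchesAll("[a-zA-Z]", "Q"));  // range: true
        System.out.println(matchesAll("[a-zA-Z]", "7"));  // range: false
    }
}
```

None of these patterns contains a backslash, so they can be written in a Java string exactly as they appear in the table.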
   

Predefined Character Classes

      .            Any character (matches line terminators only if the DOTALL flag is set)
      \d           A digit: [0-9]                                             
      \D           A non-digit: [^0-9]                                
      \s           A whitespace character: [ \t\n\x0B\f\r]            
      \S           A non-whitespace character: [^\s]                  
      \w           A word character: [a-zA-Z_0-9]                             
      \W           A non-word character: [^\w]                          
   

Warning: \ is the escape character in Java character and string literals; that is, '\n' or "\n" does not mean the character n. The backslash "escapes" the usual meaning of n, and \n takes on a special meaning: the newline character.

The predefined character classes are not single characters, so they interfere with the use of \ as an escape character. That is, a string literal has no escape sequence that means "any whitespace character". The way Java deals with this is to escape the \ itself inside a string:

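      "n"   String containing one n
      "\n"  String containing one newline
      "s"   String containing one s
      "\s"  Not the whitespace class: a compile error before Java 15, a single space from Java 15 on
      "\\s" String containing \ followed by s, which a regex reads as the class of any whitespace character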
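The doubling of the backslash can be checked directly. The sketch below assumes nothing beyond the standard java.util.regex package; the class name EscapeDemo and the helper splitOnWhitespace are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // In source code "\\s" is the two-character string \s;
    // the regex engine then reads those two characters as the whitespace class.
    static final String WHITESPACE = "\\s";

    // Hypothetical helper: split text on runs of one or more whitespace characters.
    static String[] splitOnWhitespace(String text) {
        return text.split(WHITESPACE + "+");
    }

    public static void main(String[] args) {
        System.out.println("\n".length());                       // 1: a single newline
        System.out.println(WHITESPACE.length());                 // 2: backslash and s
        System.out.println(Pattern.matches("\\s", "\t"));        // true: tab is whitespace
        System.out.println(Pattern.matches("\\d\\d", "42"));     // true: two digits
        System.out.println(splitOnWhitespace("a b\tc").length);  // 3
    }
}
```

The same rule applies to every predefined class: \d, \D, \w, \W, and \S are all written with two backslashes inside a Java string.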

   

