
How do I find all comments in source code?

To reliably find all comments in a Java source file, I would not use a regex but a real lexer (aka tokenizer).

Two popular choices for Java are:

  • JFlex:http://jflex.de
  • ANTLR:http://www.antlr.org

Contrary to popular belief, ANTLR can also be used to create only a lexer, without a parser.
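To illustrate why a plain regex is unreliable here, the following minimal sketch runs a naive `//...` pattern over one line of Java. The class name `NaiveCommentRegex`, the helper `findNaive`, and the sample line are assumptions made up for this illustration; they are not part of the demo below.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NaiveCommentRegex {

  // Hypothetical helper: returns every match of a naive "//..." pattern.
  // It has no notion of string or char literals.
  static List<String> findNaive(String source) {
    List<String> hits = new ArrayList<>();
    Matcher m = Pattern.compile("//[^\r\n]*").matcher(source);
    while (m.find()) {
      hits.add(m.group());
    }
    return hits;
  }

  public static void main(String[] args) {
    // The first "//" sits inside a string literal, the second starts a real comment.
    String source = "String url = \"http://example.com\"; // real comment";
    for (String hit : findNaive(source)) {
      System.out.println(hit);
    }
    // prints: //example.com"; // real comment
  }
}
```

The pattern starts matching inside the string literal and swallows the rest of the line, so the reported "comment" is wrong. A lexer avoids this because it consumes string and char literals as tokens of their own before it ever looks for comment delimiters.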

Here is a quick ANTLR demo. You need the following files in the same directory:

  • antlr-3.2.jar
  • JavaCommentLexer.g (the grammar)
  • Main.java
  • Test.java (a valid (!) Java source file with exotic comments)

JavaCommentLexer.g

    lexer grammar JavaCommentLexer;

    options {
      filter=true;
    }

    SingleLineComment
      :  FSlash FSlash ~('\r' | '\n')*
      ;

    MultiLineComment
      :  FSlash Star .* Star FSlash
      ;

    StringLiteral
      :  DQuote
         ( (EscapedDQuote)=> EscapedDQuote
         | (EscapedBSlash)=> EscapedBSlash
         | Octal
         | Unicode
         | ~('\\' | '"' | '\r' | '\n')
         )*
         DQuote {skip();}
      ;

    CharLiteral
      :  SQuote
         ( (EscapedSQuote)=> EscapedSQuote
         | (EscapedBSlash)=> EscapedBSlash
         | Octal
         | Unicode
         | ~('\\' | '\'' | '\r' | '\n')
         )
         SQuote {skip();}
      ;

    fragment EscapedDQuote
      :  BSlash DQuote
      ;

    fragment EscapedSQuote
      :  BSlash SQuote
      ;

    fragment EscapedBSlash
      :  BSlash BSlash
      ;

    fragment FSlash
      :  '/' | '\\' ('u002f' | 'u002F')
      ;

    fragment Star
      :  '*' | '\\' ('u002a' | 'u002A')
      ;

    fragment BSlash
      :  '\\' ('u005c' | 'u005C')?
      ;

    fragment DQuote
      :  '"'
      |  '\\u0022'
      ;

    fragment SQuote
      :  '\''
      |  '\\u0027'
      ;

    fragment Unicode
      :  '\\u' Hex Hex Hex Hex
      ;

    fragment Octal
      :  '\\' ('0'..'3' Oct Oct | Oct Oct | Oct)
      ;

    fragment Hex
      :  '0'..'9' | 'a'..'f' | 'A'..'F'
      ;

    fragment Oct
      :  '0'..'7'
      ;

Main.java

    import org.antlr.runtime.*;

    public class Main {

      public static void main(String[] args) throws Exception {

        JavaCommentLexer lexer = new JavaCommentLexer(new ANTLRFileStream("Test.java"));
        CommonTokenStream tokens = new CommonTokenStream(lexer);

        for(Object o : tokens.getTokens()) {
          CommonToken t = (CommonToken)o;
          if(t.getType() == JavaCommentLexer.SingleLineComment) {
            System.out.println("SingleLineComment :: " + t.getText().replace("\n", "\\n"));
          }
          if(t.getType() == JavaCommentLexer.MultiLineComment) {
            System.out.println("MultiLineComment  :: " + t.getText().replace("\n", "\\n"));
          }
        }
      }
    }

Test.java

    \u002f\u002a <- multi line comment start
    multi
    line
    comment // not a single line comment
    \u002A/
    public class Test {

      // single line "not a string"

      String s = "\u005C" \242 not // a comment \\\" \u002f \u005C\u005C \u0022;

      char c = \u0027"'; // the " is not the start of a string

      char q1 = '\u005c'';                  // == '\''
      char q2 = '\u005c\u0027';             // == '\''
      char q3 = \u0027\u005c\u0027\u0027;   // == '\''
      char c4 = '\047';

      String t = "";
      \u002f\u002f another single line comment
    }

Now, to run the demo, do the following:

    bart@hades:~/Programming/ANTLR/Demos/JavaComment$ java -cp antlr-3.2.jar org.antlr.Tool JavaCommentLexer.g
    bart@hades:~/Programming/ANTLR/Demos/JavaComment$ javac -cp antlr-3.2.jar *.java
    bart@hades:~/Programming/ANTLR/Demos/JavaComment$ java -cp .:antlr-3.2.jar Main

and you will see the following printed to the console:

    MultiLineComment  :: \u002f\u002a <- multi line comment start\nmulti\nline\ncomment // not a single line comment\n\u002A/
    SingleLineComment :: // single line "not a string"
    SingleLineComment :: // a comment \\\" \u002f \u005C\u005C \u0022;
    MultiLineComment  :: 
    SingleLineComment :: // the " is not the start of a string
    SingleLineComment :: // == '\''
    SingleLineComment :: // == '\''
    SingleLineComment :: // == '\''
    SingleLineComment :: \u002f\u002f another single line comment

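EDIT

Of course, you can also build a sort of lexer yourself with regular expressions. The following demo, however, does not handle Unicode literals inside the source file: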

Test2.java

    public class Test2 {

      // single line "not a string"

      String s = "\" \242 not // a comment \\\" ";

      char c = '"'; // the " is not the start of a string

      char q1 = '\'';       // == '\''
      char c4 = '\047';

      String t = "";
      // another single line comment
    }

Main2.java

    import java.util.*;
    import java.io.*;
    import java.util.regex.*;

    public class Main2 {

      private static String read(File file) throws IOException {
        StringBuilder b = new StringBuilder();
        Scanner scan = new Scanner(file);
        while(scan.hasNextLine()) {
          String line = scan.nextLine();
          b.append(line).append('\n');
        }
        return b.toString();
      }

      public static void main(String[] args) throws Exception {

        String contents = read(new File("Test2.java"));

        String slComment = "//[^\r\n]*";
        String mlComment = "/\\*[\\s\\S]*?\\*/";
        String strLit = "\"(?:\\\\.|[^\\\\\"\r\n])*\"";
        String chLit = "'(?:\\\\.|[^\\\\'\r\n])+'";
        String any = "[\\s\\S]";

        Pattern p = Pattern.compile(
            String.format("(%s)|(%s)|%s|%s|%s", slComment, mlComment, strLit, chLit, any)
        );

        Matcher m = p.matcher(contents);

        while(m.find()) {
          String hit = m.group();
          if(m.group(1) != null) {
            System.out.println("SingleLine :: " + hit.replace("\n", "\\n"));
          }
          if(m.group(2) != null) {
            System.out.println("MultiLine  :: " + hit.replace("\n", "\\n"));
          }
        }
      }
    }

If you run Main2, the following is printed to the console:

    MultiLine  :: 
    SingleLine :: // single line "not a string"
    MultiLine  :: 
    SingleLine :: // the " is not the start of a string
    SingleLine :: // == '\''
    SingleLine :: // another single line comment
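
The regex demo above does not handle Unicode escapes. One possible workaround, sketched here as an assumption rather than as part of the original demo, is to decode `\uXXXX` escapes before matching. The class `UnicodeEscapes` and its `decode` method are hypothetical names. The sketch follows the Java rule that a backslash only starts a Unicode escape when it is preceded by an even number of backslashes, and that several `u` characters may follow it.

```java
public class UnicodeEscapes {

  // Hypothetical helper: replaces each eligible \uXXXX escape with the character it denotes.
  static String decode(String source) {
    StringBuilder out = new StringBuilder(source.length());
    int backslashes = 0; // raw backslashes directly before the current position
    int i = 0;
    while (i < source.length()) {
      char c = source.charAt(i);
      if (c == '\\' && backslashes % 2 == 0
          && i + 1 < source.length() && source.charAt(i + 1) == 'u') {
        int j = i + 1;
        while (j < source.length() && source.charAt(j) == 'u') {
          j++; // skip repeated 'u' characters, as in \uu0041
        }
        boolean valid = j + 4 <= source.length();
        int value = 0;
        for (int k = j; valid && k < j + 4; k++) {
          int digit = Character.digit(source.charAt(k), 16);
          if (digit < 0) {
            valid = false;
          } else {
            value = value * 16 + digit;
          }
        }
        if (valid) {
          out.append((char) value);
          i = j + 4;
          backslashes = 0; // a decoded character never takes part in another escape
          continue;
        }
      }
      backslashes = (c == '\\') ? backslashes + 1 : 0;
      out.append(c);
      i++;
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(decode("\\u002f\\u002f another single line comment"));
    // prints: // another single line comment
  }
}
```

Calling `decode` on the file contents before building the `Matcher` would let the patterns see `//` and `/*` even when they are written as escapes. The trade-off is that the reported comments are the decoded text, not the characters as they appear in the file, so offsets no longer line up with the original source.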

