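1. You didn't make clear what format the txt file is in: one word per line, or words separated by spaces/tabs?
2. "By frequency of occurrence": which frequency exactly?
Still, I can offer a few suggestions. BufferedReader is the best fit for reading the txt file, because it can read one line at a time.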
BufferedReader br = new BufferedReader(new FileReader("d:/*.txt"));
Then call br.readLine() repeatedly to read the file one line at a time; it returns null at the end of the file.
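To make the line-by-line reading concrete, here is a minimal sketch that should cope with either file format from point 1. The class name WordReaderSketch and the method readWords are made up for illustration, and treating any run of whitespace as a word separator is my assumption, not something the question states.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original answer.
final class WordReaderSketch {

    // Reads every line and splits it on runs of whitespace, so both
    // one-word-per-line and space/tab-separated files are handled.
    static List<String> readWords(Reader source) throws IOException {
        List<String> words = new ArrayList<>();
        // try-with-resources closes the reader when the block exits
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) { // null means end of file
                for (String token : line.trim().split("\\s+")) {
                    if (!token.isEmpty()) { // a blank line yields one empty token
                        words.add(token);
                    }
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("file name is required!");
            return;
        }
        // args[0] is assumed to be the path of the text file
        System.out.println(readWords(new FileReader(args[0])));
    }
}
```

Because readWords takes a Reader, it can also be tried on an in-memory string via java.io.StringReader without creating a file.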
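As for counting frequency, I'm not sure which frequency you mean. You need to describe the problem clearly; you can email me directly and I'll help further:
[email protected]
import java.io.BufferedReader;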
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;
/**
* @author <a href="mailto:[email protected]">sawen21</a>
*
*/
public class WordStatistics {
    /**
     * Write a program that reads words from a text file and displays all
     * the words and their occurrences in ascending alphabetical order.
     * The text file is passed as a command-line argument.
     *
     * <pre>
     * results:
     * a 3
     * all 2
     * alphabetical 1
     * and 2
     * argument 1
     * as 1
     * ascending 1
     * command-line 1
     * display 1
     * displays 1
     * file 3
     * from 1
     * in 1
     * is 1
     * occurrences 1
     * order 1
     * passed 1
     * program 1
     * reads 1
     * text 2
     * that 1
     * the 3
     * their 1
     * words 2
     * Write 1
     * </pre>
     *
     * @param args command-line arguments; args[0] is the text file to read
     */
    public static void main(String[] args) {
        if (args.length == 0) throw new IllegalArgumentException("file name is required!");
        File file = new File(args[0]);
        // For case-sensitive ordering, simply drop String.CASE_INSENSITIVE_ORDER.
        Map<String, Integer> words = new TreeMap<String, Integer>(String.CASE_INSENSITIVE_ORDER);
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String str = br.readLine();
            // Stop only at end of file; a blank line must not end the loop.
            while (str != null) {
                // Default delimiters split on spaces, tabs and line breaks.
                StringTokenizer st = new StringTokenizer(str);
                while (st.hasMoreTokens()) {
                    String word = st.nextToken();
                    if (words.containsKey(word)) { // if map already contains word
                        int wordCount = words.get(word);
                        words.put(word, wordCount + 1);
                    } else {
                        words.put(word, 1);
                    }
                }
                str = br.readLine();
            }
        } catch (FileNotFoundException e) {
            System.out.println("file does not exist!");
        } catch (IOException e) {
            e.printStackTrace();
        }
        // A TreeMap iterates its keys in sorted order.
        Iterator<String> iter = words.keySet().iterator();
        while (iter.hasNext()) {
            String word = iter.next();
            System.out.println(word + " " + words.get(word));
        }
    }
}
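
If "by frequency" means listing the most frequent words first, which is only my guess at what the question wants, the counts could be sorted by value instead of by key. A minimal sketch under that assumption; FrequencyOrderSketch and byFrequency are hypothetical names, and the alphabetical tie-break is my own choice, added so the output order is deterministic:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the original answer.
final class FrequencyOrderSketch {

    // Counts the words, then orders the entries by count (highest first),
    // breaking ties alphabetically by word.
    static List<Map.Entry<String, Integer>> byFrequency(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum); // insert 1 or add 1 to the old count
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return entries;
    }

    public static void main(String[] args) {
        // Sample input invented for illustration.
        List<String> sample = Arrays.asList("the", "a", "the", "file", "a", "the");
        for (Map.Entry<String, Integer> e : byFrequency(sample)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

For the sample list in main this prints "the 3", "a 2" and "file 1", one per line. It requires Java 8 or later because of Map.merge and the Map.Entry comparators.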