Python: extract specific strings from each line of a text file and write them to a CSV file

The text content is as follows:

12-06 14:50:23.600: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +125ms
12-06 14:50:52.581: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +126ms
12-06 14:51:21.391: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +108ms
12-06 14:51:50.652: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +121ms

I want to use Python to pull specific data out of each line and write it to a CSV file.
The three strings to extract from each line are "numberlocation", "NumberLocationActivity", and "125".

The expected result in the CSV (three strings per output row):

numberlocation NumberLocationActivity 125

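Assume your file is named t.txt and is in the current directory,

and the output CSV is named csv.txt, also in the current directory.

The code is as follows: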

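import re
import csv

# Capture the package suffix, the activity name, and the launch time in ms.
pattern = r'.*(numberlocation)/\.(NumberLocationActivity).*\+(.*)ms'

# newline='' lets the csv module control line endings (Python 3).
with open('./t.txt') as f, open('./csv.txt', 'w', newline='') as cs:
    csvw = csv.writer(cs)
    for line in f:
        m = re.match(pattern, line)
        if m:  # re.match returns None for lines that do not match; skip them
            csvw.writerow(m.group(1, 2, 3))

Follow-up question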

How would a line like this be handled? 12-09 15:05:45.748: I/ActivityManager(557): Displayed com.android.phone/.PrivilegedOutgoingCallBroadcaster: +388ms
The content varies; here phone, PrivilegedOutgoingCallBroadcaster, and 388 are the parts that need to be extracted.

Follow-up answer

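Just change the regular expression.

Change

pattern=r'.*(numberlocation)/\.(NumberLocationActivity).*\+(.*)ms'

to

pattern=r'.*\.(\w+)/\.(\w+):\s*\+(\d+)ms'

Using \w+ and \d+ rather than .* keeps the trailing ": " out of the second group and limits the third group to digits.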
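To check that a generalized pattern of this kind handles both log formats from the question, here is a minimal sketch. The helper name `extract_fields`, the constant `LINE_RE`, and the third sample line are made up for illustration, and an in-memory buffer stands in for csv.txt so the result can be inspected directly.

```python
import csv
import io
import re

# Assumed generalization: last package segment, activity name, launch time in ms.
LINE_RE = re.compile(r'.*\.(\w+)/\.(\w+):\s*\+(\d+)ms')


def extract_fields(line):
    """Return (package_suffix, activity, millis), or None if the line does not match."""
    m = LINE_RE.match(line)
    return m.group(1, 2, 3) if m else None


sample_lines = [
    "12-06 14:50:23.600: I/ActivityManager(605): Displayed "
    "com.suning.numberlocation/.NumberLocationActivity: +125ms",
    "12-09 15:05:45.748: I/ActivityManager(557): Displayed "
    "com.android.phone/.PrivilegedOutgoingCallBroadcaster: +388ms",
    "this line has no Displayed entry",  # hypothetical non-matching input
]

buf = io.StringIO()  # stands in for open('./csv.txt', 'w', newline='')
writer = csv.writer(buf)
for line in sample_lines:
    fields = extract_fields(line)
    if fields is not None:  # skip lines without the expected structure
        writer.writerow(fields)

print(buf.getvalue())
```

Returning None for unmatched lines keeps the loop from raising AttributeError on log lines that are not "Displayed" entries.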
Answer 1  2013-12-06
#!/usr/bin/python
# coding: utf-8
#
# filename: regexTester.py
# author: Tim Wang
# date: Dec., 2013

import re

context = """12-06 14:50:23.600: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +125ms
12-06 14:50:52.581: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +126ms
12-06 14:51:21.391: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +108ms
12-06 14:51:50.652: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +121ms"""


patt = re.compile(r"""
    (?P<dt>\d{1,2}-\d{2}\s\d{1,2}:\d{2}:\d{2}\.\d{3})
    .*
    (?<=NumberLocationActivity:\s\+)(?P<numberlocation>\d+)ms
    """, re.I|re.U|re.X)

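# Note: the group named "numberlocation" actually holds the millisecond value.
outputfmt = "numberlocation     NumberLocationActivity   %(numberlocation)s"
for ln in context.splitlines():
    m = patt.match(ln)
    if m:  # skip lines the pattern does not match
        print(outputfmt % m.groupdict())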
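The named-group approach can also feed a CSV directly. The following is only a sketch under assumptions: the group names `package`, `activity`, and `millis` and the column order are illustrative choices, not part of the original answer, and the output goes to an in-memory buffer rather than a file.

```python
import csv
import io
import re

context = """12-06 14:50:23.600: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +125ms
12-06 14:50:52.581: I/ActivityManager(605): Displayed com.suning.numberlocation/.NumberLocationActivity: +126ms"""

# Verbose pattern with named groups; the group names are illustrative.
patt = re.compile(r"""
    (?P<dt>\d{1,2}-\d{2}\s\d{1,2}:\d{2}:\d{2}\.\d{3})   # timestamp
    .*\sDisplayed\s
    [\w.]*\.(?P<package>\w+)                            # last segment of the package name
    /\.(?P<activity>\w+):\s                             # activity name
    \+(?P<millis>\d+)ms                                 # launch time in milliseconds
    """, re.X)

fields = ["dt", "package", "activity", "millis"]
buf = io.StringIO()  # swap for open('./csv.txt', 'w', newline='') to write a file
writer = csv.DictWriter(buf, fieldnames=fields)
writer.writeheader()
for ln in context.splitlines():
    m = patt.match(ln)
    if m:  # ignore lines that do not match
        writer.writerow(m.groupdict())

print(buf.getvalue())
```

With csv.DictWriter the group names double as column headers, so adding or reordering columns only means editing the `fields` list.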
