I have lines of data which I want to parse. The data looks like this:
a score=216 expect=1.05e-06
a score=180 expect=0.0394
What I want is a subroutine that parses them and returns two values (score and expect) for each line.
However, this function of mine doesn't seem to work:
def scoreEvalFromMaf(mafLines):
    for word in mafLines[0]:
        if word.startswith("score="):
            theScore = word.split('=')[1]
            theEval = word.split('=')[2]
            return [theScore, theEval]
    raise Exception("encountered an alignment without a score")
Please advise: what's the right way to do it?
Answer
It looks like you want to split each line on whitespace and parse each chunk separately. If you pass in a single line as a string (i.e. one element of the list returned by .readlines()):
def scoreEvalFromMafLine(mafLine):
    theScore, theEval = None, None
    # Split the line on whitespace and look at each "key=value" token.
    for word in mafLine.split():
        if word.startswith("score="):
            theScore = word.split('=')[1]
        if word.startswith("expect="):
            theEval = word.split('=')[1]
    # Fail loudly if either field is missing from the line.
    if theScore is None or theEval is None:
        raise Exception("Invalid line: '%s'" % mafLine)
    return (theScore, theEval)
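If you want numbers rather than strings, and want to run this over every line, here is a minimal sketch of one way to do it. The name parse_maf_score_line is hypothetical, and it assumes that score is always an integer and expect is always something float() can read; adjust if your files differ.

```python
def parse_maf_score_line(line):
    """Return (score, expect) as numbers from a line like 'a score=216 expect=1.05e-06'."""
    fields = {}
    for word in line.split():
        # partition() splits on the first '=' only; sep is '' if there is none.
        key, sep, value = word.partition("=")
        if sep:
            fields[key] = value
    if "score" not in fields or "expect" not in fields:
        raise ValueError("Invalid line: %r" % line)
    # Assumption: score is an integer and expect is a float.
    return int(fields["score"]), float(fields["expect"])


lines = ["a score=216 expect=1.05e-06", "a score=180 expect=0.0394"]
results = [parse_maf_score_line(line) for line in lines]
print(results)  # [(216, 1.05e-06), (180, 0.0394)]
```

Collecting the tokens into a dict first means the order of score= and expect= on the line doesn't matter.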
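Your version iterates over each character of the first line (mafLines is a list of strings, so mafLines[0] is a single string) rather than over each whitespace-separated word. No single character can start with "score=", so the loop never matches and the function always falls through to the exception. Note also that word.split('=')[2] would fail even on a whole word, because "score=216" splits into only two parts; expect= lives in a separate word.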
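You can see the difference directly with a couple of lines in the interpreter. This is just an illustration using the two sample lines from the question:

```python
mafLines = ["a score=216 expect=1.05e-06", "a score=180 expect=0.0394"]

# Iterating over a string yields one character at a time.
chars = [c for c in mafLines[0]]
print(chars[:4])  # ['a', ' ', 's', 'c']

# None of those single characters can start with "score=".
print(any(c.startswith("score=") for c in chars))  # False

# Splitting on whitespace yields the words you actually want.
words = mafLines[0].split()
print(words)  # ['a', 'score=216', 'expect=1.05e-06']

# Each key=value word has exactly two parts, so index 2 does not exist.
print("score=216".split("="))  # ['score', '216']
```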