Parsing line with delimiter in Python

I have lines of data which I want to parse. The data looks like this:

a score=216 expect=1.05e-06
a score=180 expect=0.0394

I want a function that parses these lines and returns two values (score and expect) for each line.

However this function of mine doesn’t seem to work:

def scoreEvalFromMaf(mafLines):
    for word in mafLines[0]:
        if word.startswith("score="):
            theScore = word.split('=')[1]
            theEval  = word.split('=')[2]
            return [theScore, theEval]
    raise Exception("encountered an alignment without a score")

Please advise: what’s the right way to do it?

Answer

It looks like you want to split each line on whitespace and parse each chunk separately. If mafLine is a string (i.e. one line from .readlines()):

def scoreEvalFromMafLine(mafLine):
    theScore, theEval = None, None
    for word in mafLine.split():
        if word.startswith("score="):
            theScore = word.split('=')[1]
        if word.startswith("expect="):
            theEval  = word.split('=')[1]

    if theScore is None or theEval is None:
        raise Exception("Invalid line: '%s'" % mafLine)

    return (theScore, theEval)
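
Note that the function above returns both values as strings. If you need numbers, here is a minimal sketch of one way to convert them. The name parse_score_line is hypothetical, and it assumes score is always an integer and expect is always a valid float literal, which the two sample lines suggest but do not guarantee:

```python
def parse_score_line(line):
    """Return (score, expect) as numbers from a line like 'a score=216 expect=1.05e-06'."""
    fields = {}
    for word in line.split():                  # split on any whitespace
        key, sep, value = word.partition("=")  # sep is "" when the word has no "="
        if sep:                                # keep only key=value chunks
            fields[key] = value

    if "score" not in fields or "expect" not in fields:
        raise ValueError("Invalid line: %r" % line)

    # Assumption: score is an integer, expect is a float (possibly in e-notation).
    return int(fields["score"]), float(fields["expect"])


lines = ["a score=216 expect=1.05e-06", "a score=180 expect=0.0394"]
results = [parse_score_line(line) for line in lines]
print(results)  # [(216, 1.05e-06), (180, 0.0394)]
```

Using partition instead of split('=') avoids an IndexError on chunks without an "=" sign, such as the leading "a".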

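Your version iterates over each character of the first line (mafLines is a list of strings, so mafLines[0] is a single string) rather than over each whitespace-separated word, so no item ever starts with "score=". Also, word.split('=')[2] would raise an IndexError even on the right word, since "score=216" splits into only two parts.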
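
To see the difference for yourself, here is a small illustration using the two sample lines from the question (the variable names are just for this example):

```python
maf_lines = ["a score=216 expect=1.05e-06", "a score=180 expect=0.0394"]

# Iterating over a string yields single characters, not words.
chars = [c for c in maf_lines[0]]
print(chars[:4])  # ['a', ' ', 's', 'c']

# A one-character string can never start with "score=".
print(any(c.startswith("score=") for c in chars))  # False

# Splitting on whitespace gives the chunks you actually want to inspect.
words = maf_lines[0].split()
print(words)  # ['a', 'score=216', 'expect=1.05e-06']

# Each chunk has exactly one "=", so split('=') returns two items;
# index [1] is the value and index [2] does not exist.
parts = words[1].split("=")
print(parts)  # ['score', '216']
```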

User contributions licensed under: CC BY-SA