Parsing a string with multiple levels of key-value pairs

Question

I need to write part of a program that works with a string, finds certain substrings, and copies them into a dictionary. An example of the string:

thestring ='\
#: somethings\nchars0 "substr0"\nchars1 "substr1"\n\n\
#: something\nchars0 "substr2"\nchars1 "substr3"'

So it contains the fixed substrings 'chars0' and 'chars1', which I know in advance, and some arbitrary substrings such as 'substr0' ...

But there is one small problem: the beginning of the string has a different structure:

'chars0 ""\nchars1 ""\n"words\n"\n"else words\n"\n ... '

and only after that is the string structured as shown above.

I only know how many of these substrings with words there are, but I am not interested in them; I need substrings only from the structured part of the string.

If you print the string, you get this:

chars0 ""
chars1 ""
"some words\n"
"else words\n"
... 

#: something
chars0 "subtring"
chars1 "else substring"

...

How can I optimize the search? (I think it would be good to try the string method find() or rfind(), because the end of the string is structured.)

python strings

asked by Illia Ananich, 10.05.2016 - 05:36

1 answer

score 0 · Accepted Answer

A very simple way to parse this string could be:

thestring ='\
#: somethings\nchars0 "substr0"\nchars1 "substr1"\n\n\
#: something\nchars0 "substr2"\nchars1 "substr3"'

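# handler is an instance of StringStructure or StringPrinter (both defined further down)
handler = StringStructure()

for token in thestring.split('\n'):
  if token.startswith('#'):   # heading line, e.g. '#: something'
    handler.heading(token)
  elif token == '':           # blank line between sections
    handler.section(token)
  else:                       # key "value" line
    handler.body(token)

print(handler.dictOfDicts)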

This will print (the order of the dictionary keys may vary)

{'#: something': {'chars0 ': 'substr2', 'chars1 ': 'substr3'},
'#: somethings': {'chars0 ': 'substr0', 'chars1 ': 'substr1'}}

or

heading: #: somethings
key: chars0
value: substr0
key: chars1
value: substr1
section:
heading: #: something
key: chars0
value: substr2
key: chars1
value: substr3

depending on whether handler is set to StringStructure() or StringPrinter(), respectively, where:

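class StringStructure:
  def __init__(self):
    # Per-instance state; a class-level dict would be shared by all instances.
    self.dictOfDicts = {}
    self.currentHeading = ''
  def heading(self, token):
    # Start a new inner dictionary for this heading.
    self.currentHeading = token
    self.dictOfDicts[self.currentHeading] = {}
  def section(self, token):
    # Blank lines only separate sections; nothing to record.
    pass
  def body(self, token):
    # 'chars0 "substr0"' splits into ['chars0 ', 'substr0', ''].
    key, value, junk = token.split('"')
    self.dictOfDicts[self.currentHeading][key] = value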

class StringPrinter:
  dictOfDicts = ''
  def heading(self, token):
    print('heading: ' + token)
  def section(self, token):
    print('section: ' + token)
  def body(self, token):
    # 'chars0 "substr0"' splits into ['chars0 ', 'substr0', ''].
    key, value, junk = token.split('"')
    print('key: ' + key)
    print('value: ' + value)

This is fragile: it assumes that each body line always has exactly two elements (a key and a quoted value) and that there are always two levels of keys, and it will not produce very meaningful error messages when the format of your string is wrong.
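One possible way to reduce that fragility is sketched below. It is not part of the original answer, and the names parse_sections, PAIR_RE and marker are made up for illustration. It assumes that the marker "#:" never appears in the unstructured beginning of the string (so find() can be used to skip it, as the question suggests) and that every key consists of word characters followed by a double-quoted value.

```python
import re

# Assumed line format: a key made of word characters, whitespace,
# then a double-quoted value, e.g.  chars0 "substr0"
PAIR_RE = re.compile(r'^(\w+)\s+"(.*)"$')


def parse_sections(text, marker="#:"):
    """Return {heading: {key: value}} for the structured part of text."""
    # Skip the unstructured preamble by jumping to the first heading marker.
    # Assumption: the marker does not occur inside the preamble itself.
    start = text.find(marker)
    if start == -1:
        raise ValueError("no %r heading found" % marker)

    result = {}
    current = None
    for lineno, line in enumerate(text[start:].split("\n"), 1):
        if line.startswith(marker):
            # New heading: start (or reuse) its inner dictionary.
            current = result.setdefault(line, {})
        elif not line.strip():
            continue  # blank line between sections
        else:
            match = PAIR_RE.match(line)
            if match is None:
                # lineno counts from the first heading, not from the start of text.
                raise ValueError(
                    "line %d is not a key/value pair: %r" % (lineno, line))
            current[match.group(1)] = match.group(2)
    return result


sample = ('chars0 ""\nchars1 ""\n"some words\\n"\n\n'
          '#: somethings\nchars0 "substr0"\nchars1 "substr1"\n\n'
          '#: something\nchars0 "substr2"\nchars1 "substr3"')
parsed = parse_sections(sample)
```

Unlike the handler classes, this version accepts any number of key/value lines per heading, strips the trailing space from the keys, and reports the offending line when the format is wrong. It still handles only two levels of keys.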