Code: Select all
###randomly selects 100 sequences and adds them to the FASTA
def insert (source_str, insert_str, pos):
return source_str[:pos] + insert_str + source_str[pos:]
def get_retro_text(genome, all_strings):
string_of_choice = [string for string in all_strings if 400 < len(string) < 500]
hundred_strings = random.sample(string_of_choice, k=100)
text_of_strings = []
for k in range(len(hundred_strings)):
text_of_strings.append(str(hundred_strings[k].seq))
single_string = ",".join(text_of_strings)
new_genome = insert(genome, single_string, random.randint(0, len(genome)))
return new_genome
big_genome = get_retro_text(body, s)
Code: Select all
body
Code: Select all
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNtaaccctaaccctaacccta
accctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaacccta
accctaaccctaaccctaaccctaacccaaccctaaccctaaccctaaccctaaccctaa
ccctaacccctaaccctaaccctaaccctaaccctaacctaaccctaaccctaaccctaa
ccctaaccctaaccctaaccctaaccctaacccctaaccctaaccctaaaccctaaaccc
taaccctaaccctaaccctaaccctaaccccaaccccaaccccaaccccaaccccaaccc
caaccctaacccctaaccctaaccctaaccctaccctaaccctaaccctaaccctaaccc
taaccctaacccctaacccctaaccctaaccctaaccctaaccctaaccctaaccctaac
ccctaaccctaaccctaaccctaaccctcgCGGTACCCTCAGCCGGCCCGCCCGCCCGGG
TCTGACCTGAGGAGAACTGTGCTCCGCCTTCAGAGTACCACCGAAATCTGTGCAGAGGAc
aacgcagctccgccctcgcggtGCTCtccgggtctgtgctgaggagaacgCAACTCCGCC
GTTGCAAAGGCGcgccgcgccggcgcaggcgcagagaggcgcgccgcgccggcgcaggcg
cagagaggcgcgccgcgccggcgcaggcgcagagaggcgcgccgcgccggcgcaggcgca
gagaggcgcgccgcgccggcgcaggcgcagagaggcgcgccgcgccggcgcaggcgcaga
caCATGCTAGCGCGTCGGGGTGGAGGCgtggcgcaggcgcagagaggcgcgccgcgccgg
cgcaggcgcagagacaCATGCTACCGCGTCCAGGGGTGGAGGCgtggcgcaggcgcagag
aggcgcaccgcgccggcgcaggcgcagagacaCATGCTAGCGCGTCCAGGGGTGGAGGCG
TggcgcaggcgcagagacgcAAGCCTAcgggcgggggttgggggggcgTGTGTTGCAGGA
GCAAAGTCGCACGGCGCCGGGCTGGGGCGGGGGGAGGGTGGCGCCGTGCACGCGCAGAAA
CTCACGTCACGGTGGCGCGGCGCAGAGACGGGTAGAACCTCAGTAATCCGAAAAGCCGGG
ATCGACCGCCCCTTGCTTGCAGCCGGGCACTACAGGACCCGCTTGCTCACGGTGCTGTGC
CAGGGCGCCCCCTGCTGGCGACTAGGGCAACTGCAGGGCTCTCTTGCTTAGAGTGGTGGC
CAGCGCCCCCTGCTGGCGCCGGGGCACTGCAGGGCCCTCTTGCTTACTGTATAGTGGTGG
CACGCCGCCTGCTGGCAGCTAGGGACATTGCAGGGTCCTCTTGCTCAAGGTGTAGTGGCA
GCACGCCCACCTGCTGGCAGCTGGGGACACTGCCGGGCCCTCTTGCTCCAACAGTACTGG
CGGATTATAGGGAAACACCCGGAGCATATGCTGTTTGGTCTCAGTAGACTCCTAAATATG
GGATTCCTgggtttaaaagtaaaaaataaatatgtttaatttgtGAACTGATTACCATCA
GAATTGTACTGTTCTGTATCCCACCAGCAATGTCTAGGAATGCCTGTTTCTCCACAAAGT
GTTtacttttggatttttgccagTCTAACAGGTGAAGCCCTGGAGATTCTTATTAGTGAT
TTGGGCTGGGGCCTGgccatgtgtatttttttaaatttccactgaTGATTTTGCTGCATG
GCCGGTGTTGAGAATGACTGCGCAAATTTGCCGGATTTCCTTTGCTGTTCCTGCATGTAG
TTTAAACGAGATTGCCAGCACCGGGTATCATTCACCATTTTTCTTTTCGTTAACTTGCCG
TCAGCCTTTTCTTTGACCTCTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCCAGA
CTTCCCGTGTCCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTT
CATCTGCAGGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCAAGCTGAGCAC
TGGAGTGGAGTTTTCCTGTGGAGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCAT
< /code>
s
Code: Select all
['ATGACGAACACAAAGGGAAGGAGGAGAGGCACGCGATATATGTTCTCCAGACCTTTTAGAAAACACGGAGTTGTTCCTTTGGCCACATATATGCGAATCTATAAGAAAGGTGACATTGTAGGCATCAAGGGAATGCATACTGTTGAAAAAGGAATGCCCGCAAGTGTTACCATGGCAAAACTGGAAGAGCCTACAATGTTCCCCAGCACGCTCTTACGTTGTTGTTAAGGGCAAGATTCTCGCCAAGAGGATTAACGTGCGTATTGAGCACATTAAGCACTCTAAGAGCTGAGATGGCTTCCTGAAACGCGTGAAGGAAAATGATAAGATAAAGAAAGACGCCGAAGAGAAAGGTACCTGGGTTCAATTGAAGCGCCAGCCTGCTCCACCCAGAGAAGCACACTGTGTGAGAACCAATGGGAAGGAGCCTGAGCTGCTGGAACCTCTTCCCTATGAATTCATGGCC',
'ATGGGCAAGTTCATAAAACCTGGGAAAGTAGTGTTGGTCCAGGCCAGACACTACACCGGATGCTACTCTGGATGCAAAACCATCATCGTGAAGAACATTGATGATGGCACCTTAGAATGCCCCGTCAGCTGTTCTCTGGTGGCTGGAATTGACTGTTATCCTTGCAAGGTGACAGCTGCCATGGGCAAGAAGAGCACCCAGAGGTCAAAGACCAAGTCTTTTGTGAAAGTTTATAACTACAATCATCTCATGCCCACAAGGCACTCTGTGGATACCCCCTTGGACAAAACTGTCATCAACAAGGATGTCTTCAGAGACCCTGCTCTTAAACACAAGGCCCAAAGGAAGGCCAAAGTCAAAATCAAAGAGAGGTAAAACCTGGGCAAGAACAAGTGGCTCTTCCAAAAGCTGTGGTTT',
'ATGGTGCCGAAAGTGAAGAAGGAAGCTCCTGCCCCTCCTAAAGCCGAAGCCAAAGCGAAGGCTTTAAAGGCCAAGAAGGCAGTGTTGAAAGGTGTCCACAGCCACAAAAAGAAGATCCACACGTCACCCACCTTCCGGCGGCCGAAGACACTGCGACTCCGGAGACAGCCCAAATATCCTCGGAAGAGCGCTCCCAGGAGAAACAAGCTTGACCACTATGCTATCATCAAGTTTCCGCTGACCACTGAGTCTGCCATGAAGAAGATAGAAGACAACAACACACTTGTGTTCATTGTGGATGTTAAAGCCAACAAGCACCAGATCAAACAGGCTGTGAAGAAGCTCTATGACATTGATGTGGCCAAGGTCAACACCCTGATTCGGCCTGATGGAGAGAAGAAGGTATATGTTCGACTGACTCCTGATTACGATGCTTTGGATGTTGCCAACAAAATTGGGATTATC']