Compare commits
31 Commits
119 changed files with 50262 additions and 1831 deletions
@ -0,0 +1,14 @@ |
|||||||
|
# https://docs.travis-ci.com/user/languages/go/ |
||||||
|
language: go |
||||||
|
go: |
||||||
|
- 1.10.x |
||||||
|
- 1.11.x |
||||||
|
- tip |
||||||
|
|
||||||
|
install: true |
||||||
|
|
||||||
|
script: |
||||||
|
- go test -v ./rosalind/... |
||||||
|
- go test -v ./chapter1/... |
||||||
|
- go test -v ./chapter2/... |
||||||
|
- go test -v ./chapter3/... |
@ -0,0 +1,19 @@ |
|||||||
|
Copyright 2019 Charles Reid |
||||||
|
|
||||||
|
Permission is hereby granted, free of charge, to any person obtaining a copy of |
||||||
|
this software and associated documentation files (the "Software"), to deal in |
||||||
|
the Software without restriction, including without limitation the rights to |
||||||
|
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies |
||||||
|
of the Software, and to permit persons to whom the Software is furnished to do |
||||||
|
so, subject to the following conditions: |
||||||
|
|
||||||
|
The above copyright notice and this permission notice shall be included in all |
||||||
|
copies or substantial portions of the Software. |
||||||
|
|
||||||
|
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR |
||||||
|
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, |
||||||
|
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE |
||||||
|
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER |
||||||
|
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, |
||||||
|
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE |
||||||
|
SOFTWARE. |
@ -1,41 +1,126 @@ |
|||||||
# Go-Rosalind |
# go-rosalind |
||||||
|
|
||||||
Solving problems from Rosalind.info using Go |
`rosalind` is a Go (golang) package for solving bioinformatics problems. |
||||||
|
|
||||||
## Organization |
[![travis](https://img.shields.io/travis/charlesreid1/go-rosalind.svg)](https://travis-ci.org/charlesreid1/go-rosalind) |
||||||
|
[![golang](https://img.shields.io/badge/language-golang-00ADD8.svg)](https://golang.org) |
||||||
|
[![license](https://img.shields.io/github/license/charlesreid1/go-rosalind.svg)](https://github.com/charlesreid1/go-rosalind/blob/master/LICENSE) |
||||||
|
[![godoc](https://godoc.org/github.com/charlesreid1/go-rosalind?status.svg)](http://godoc.org/github.com/charlesreid1/go-rosalind) |
||||||
|
|
||||||
|
## Summary |
||||||
|
|
||||||
Each chapter has its own directory. |
This repo contains a Go (golang) library, `rosalind`, that implements |
||||||
|
functionality for solving bioinformatics problems. This is mainly |
||||||
|
useful for problems on Rosalind.info but is for general use as well. |
||||||
|
|
||||||
Within the chapter directory, each problem has |
Rosalind problems are grouped by chapter. Each problem has its own |
||||||
its own driver program, which prints info about |
function and is implemented in a library called `chapter1`, `chapter2`, |
||||||
the problem, loads the input file from Rosalind, |
etc. |
||||||
and prints the solution. Each problem also has |
|
||||||
its own test suite using the examples provided |
|
||||||
on Rosalind.info. |
|
||||||
|
|
||||||
For example, the function that loads the |
For example, Chapter 1 question A is implemented in package |
||||||
input file for problem BA1A is in `ba1a.go` |
`chapter1` as the function `BA1a( <input-file-name> )`. |
||||||
and the code to test the functionality |
This (specific) functionality wraps the (general purpose) |
||||||
of the solution to BA1A is in `ba1a_test.go`. |
`rosalind` library. |
||||||
|
|
||||||
## Quick Start |
## Quick Start |
||||||
|
|
||||||
To run all the tests in a chapter directory: |
### Rosalind |
||||||
|
|
||||||
|
The `rosalind` library can be installed using `go get`: |
||||||
|
|
||||||
|
``` |
||||||
|
go get github.com/charlesreid1/go-rosalind/rosalind |
||||||
|
``` |
||||||
|
|
||||||
|
The library can now be imported and its functions called directly. |
||||||
|
Here is a brief example: |
||||||
|
|
||||||
|
``` |
||||||
|
package main |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
func main() { |
||||||
|
input := "AAAATGCGCTAGTAAAAGTCACTGAAAA" |
||||||
|
k := 4 |
||||||
|
result, _ := rosalind.MostFrequentKmers(input, k) |
||||||
|
fmt.Println(result) |
||||||
|
} |
||||||
``` |
``` |
||||||
go test -v |
|
||||||
|
### Problem Sets |
||||||
|
|
||||||
|
Each set of problems is grouped into its own package. These |
||||||
|
packages import the `rosalind` package, so it should be |
||||||
|
available. |
||||||
|
|
||||||
|
You can install the Chapter 1 problem set, for example, like so: |
||||||
|
|
||||||
|
``` |
||||||
|
go get github.com/charlesreid1/go-rosalind/chapter1 |
||||||
|
``` |
||||||
|
|
||||||
|
This can now be imported and used in any Go program. |
||||||
|
|
||||||
|
Try creating a `main.go` file in a temporary directory, |
||||||
|
and run it with `go run main.go`: |
||||||
|
|
||||||
``` |
``` |
||||||
|
package main |
||||||
|
|
||||||
To run only a particular problem: |
import ( |
||||||
|
rch1 "github.com/charlesreid1/go-rosalind/chapter1" |
||||||
|
) |
||||||
|
|
||||||
1. Edit `main.go` to call the right method |
func main() { |
||||||
for the right problem with the right input |
filename := "rosalind_ba1a.txt" |
||||||
file name. |
rch1.BA1a(filename) |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
2. Run `main.go` using `go run`, and point Go |
Assuming an input file `rosalind_ba1a.txt` is available, |
||||||
to all the relevant Go files: |
you should see a problem description and the output of |
||||||
|
the problem, which can be copied and pasted into |
||||||
|
Rosalind.info: |
||||||
|
|
||||||
``` |
``` |
||||||
go run main.go utils.go rosalind.go <name-of-BA-file> |
$ go run main.go |
||||||
|
|
||||||
|
----------------------------------------- |
||||||
|
Rosalind: Problem BA1a: |
||||||
|
Most Frequent k-mers |
||||||
|
|
||||||
|
Given an input string and a length k, |
||||||
|
report the k-mer or k-mers that occur |
||||||
|
most frequently. |
||||||
|
|
||||||
|
URL: http://rosalind.info/problems/ba1a/ |
||||||
|
|
||||||
|
|
||||||
|
Computed result from input file: for_real/rosalind_ba1a.txt |
||||||
|
39 |
||||||
``` |
``` |
||||||
|
|
||||||
|
## Command Line Interface |
||||||
|
|
||||||
|
TBA |
||||||
|
|
||||||
|
## Organization |
||||||
|
|
||||||
|
The repo contains the following directories: |
||||||
|
|
||||||
|
* `rosalind/` - code and functions for the Rosalind library |
||||||
|
|
||||||
|
* `chapter1/` - solutions to chapter 1 questions (utilizes `rosalind` library) |
||||||
|
|
||||||
|
* `chapter2/` - solutions to chapter 2 questions |
||||||
|
|
||||||
|
* `chapter3/` - solutions to chapter 3 questions |
||||||
|
|
||||||
|
* `stronghold/` - solutions to questions from the stronghold section of Rosalind.info |
||||||
|
|
||||||
|
See the Readme file in each respective directory for more info. |
||||||
|
|
||||||
|
@ -1,73 +0,0 @@ |
|||||||
# Chapter 1 |
|
||||||
|
|
||||||
In this chapter we perform basic operations with |
|
||||||
strings and data structures. |
|
||||||
|
|
||||||
## How to run |
|
||||||
|
|
||||||
* Each problem has its own function |
|
||||||
|
|
||||||
* To run the code for a particular problem, |
|
||||||
call the function for that problem in `main.go` |
|
||||||
|
|
||||||
* Edit `main.go` to call the right function, |
|
||||||
and pass in the name of the input file you |
|
||||||
want to use: for example, `BA1A("input.txt")` |
|
||||||
|
|
||||||
* The function you call is implemented in the |
|
||||||
corresponding Go file (for example, `ba1a.go`). |
|
||||||
It loads the inputs from the input file, |
|
||||||
calls the right function with the inputs, |
|
||||||
and prints the results. |
|
||||||
|
|
||||||
* The functions that load data from input files |
|
||||||
are tested along with the functions themselves, |
|
||||||
since each problem has a sample input file |
|
||||||
in `data/` |
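
The driver flow described in the list above (load the input file, pull out the inputs, hand them to the solver) might be sketched as follows. The repo's own `readLines` helper lives in `utils.go`, which is not shown in this diff, so `readLinesFrom` and `parseBA1A` below are hypothetical stand-ins rather than the author's code. The sketch assumes the two-line BA1A input format (text on the first line, pattern on the second).

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLinesFrom is a hypothetical stand-in for the repo's readLines helper.
// It returns every line of r with the line ending removed.
func readLinesFrom(r io.Reader) ([]string, error) {
	var lines []string
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	return lines, scanner.Err()
}

// parseBA1A is a hypothetical helper that extracts the text and the
// pattern from the assumed two-line BA1A input format.
func parseBA1A(text string) (input, pattern string, err error) {
	lines, err := readLinesFrom(strings.NewReader(text))
	if err != nil {
		return "", "", err
	}
	if len(lines) < 2 {
		return "", "", fmt.Errorf("expected 2 lines, got %d", len(lines))
	}
	return lines[0], lines[1], nil
}

func main() {
	// An in-memory string keeps the sketch runnable without a file on disk;
	// a real driver would open the input file with os.Open first.
	input, pattern, err := parseBA1A("GCGCG\nGCG\n")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("input=%s pattern=%s\n", input, pattern)
}
```

Checking the number of lines before indexing avoids a panic on truncated input files. Note that `bufio.Scanner` rejects lines longer than 64 KiB by default, so very long genome lines would need a larger buffer set via `scanner.Buffer`.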
|
||||||
|
|
||||||
## Directory Layout |
|
||||||
|
|
||||||
* Each problem has one Go file and one test |
|
||||||
|
|
||||||
* The `data/` directory contains input files |
|
||||||
for the tests (i.e., files that contain both |
|
||||||
inputs and corresponding outputs) |
|
||||||
|
|
||||||
* The `for_real/` directory contains sample |
|
||||||
input files from Rosalind.info for each |
|
||||||
problem (i.e., files that contain only the |
|
||||||
inputs) |
|
||||||
|
|
||||||
* The `main.go` file contains the `main()` |
|
||||||
driver function and is the entrypoint for |
|
||||||
`go run` |
|
||||||
|
|
||||||
* The `rosalind.go` file contains most of the |
|
||||||
computational functionality implemented |
|
||||||
for the problems. |
|
||||||
|
|
||||||
* The `utils.go` file contains utilities unrelated |
|
||||||
to bioinformatics. |
|
||||||
|
|
||||||
## Compiling and Running |
|
||||||
|
|
||||||
To run all tests, `go test`: |
|
||||||
|
|
||||||
``` |
|
||||||
go test -v |
|
||||||
``` |
|
||||||
|
|
||||||
To run a specific problem, edit `main.go` |
|
||||||
to call the corresponding problem's function |
|
||||||
and then `go run`: |
|
||||||
|
|
||||||
``` |
|
||||||
go run main.go utils.go rosalind.go <name of ba1 file.go> |
|
||||||
``` |
|
||||||
|
|
||||||
## To Do |
|
||||||
|
|
||||||
Add a Snakefile |
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -1,54 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"log" |
|
||||||
) |
|
||||||
|
|
||||||
// Rosalind: Problem BA1A: Most Frequent k-mers
|
|
||||||
|
|
||||||
// Describe the problem
|
|
||||||
func BA1ADescription() { |
|
||||||
description := []string{ |
|
||||||
"-----------------------------------------", |
|
||||||
"Rosalind: Problem BA1A:", |
|
||||||
"Most Frequent k-mers", |
|
||||||
"", |
|
||||||
"Given an input string and a length k,", |
|
||||||
"report the k-mer or k-mers that occur", |
|
||||||
"most frequently.", |
|
||||||
"", |
|
||||||
"URL: http://rosalind.info/problems/ba1a/", |
|
||||||
"", |
|
||||||
} |
|
||||||
for _, line := range description { |
|
||||||
fmt.Println(line) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Describe the problem,
|
|
||||||
// print the name of the input file,
|
|
||||||
// print the output/result
|
|
||||||
func BA1A(filename string) { |
|
||||||
|
|
||||||
BA1ADescription() |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a single string
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// Input file contents
|
|
||||||
var input, pattern string |
|
||||||
input = lines[0] |
|
||||||
pattern = lines[1] |
|
||||||
|
|
||||||
result := PatternCount(input, pattern) |
|
||||||
|
|
||||||
fmt.Println("") |
|
||||||
fmt.Printf("Computed result from input file: %s\n",filename) |
|
||||||
fmt.Println(result) |
|
||||||
} |
|
||||||
|
|
@ -1,99 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"log" |
|
||||||
"strconv" |
|
||||||
"testing" |
|
||||||
) |
|
||||||
|
|
||||||
// To run this test:
|
|
||||||
//
|
|
||||||
// $ go test -v -run TestPatternCount
|
|
||||||
|
|
||||||
// Run a single test of the PatternCount function
|
|
||||||
func TestPatternCount(t *testing.T) { |
|
||||||
// Call the PatternCount function
|
|
||||||
input := "GCGCG" |
|
||||||
pattern := "GCG" |
|
||||||
result := PatternCount(input,pattern) |
|
||||||
gold := 2 |
|
||||||
if result != gold { |
|
||||||
err := fmt.Sprintf("Error testing PatternCount(): input = %s, pattern = %s, result = %d (should be %d)", |
|
||||||
input, pattern, result, gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Run a test matrix of the PatternCount function
|
|
||||||
func TestMatrixPatternCount(t *testing.T) { |
|
||||||
// Construct a test matrix
|
|
||||||
var tests = []struct { |
|
||||||
input string |
|
||||||
pattern string |
|
||||||
gold int |
|
||||||
}{ |
|
||||||
{"GCGCG", "GCG", 2}, |
|
||||||
{"GAGGGGGGGAG", "AGG", 1}, |
|
||||||
{"GCACGCACGCAC", "GCAC", 3}, |
|
||||||
{"", "GC", 0}, |
|
||||||
{"GCG", "GTACTCTC", 0}, |
|
||||||
{"ACGTACGTACGT", "CG", 3}, |
|
||||||
{"AAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAATAATTACAGAGTACACAACATCCA", |
|
||||||
"AAA", 4}, |
|
||||||
{"AGCGTGCCGAAATATGCCGCCAGACCTGCTGCGGTGGCCTCGCCGACTTCACGGATGCCAAGTGCATAGAGGAAGCGAGCAAAGGTGGTTTCTTTCGCTTTATCCAGCGCGTTAACCACGTTCTGTGCCGACTTT", |
|
||||||
"TTT", 4}, |
|
||||||
{"GGACTTACTGACGTACG","ACT", 2}, |
|
||||||
{"ATCCGATCCCATGCCCATG","CC", 5}, |
|
||||||
{"CTGTTTTTGATCCATGATATGTTATCTCTCCGTCATCAGAAGAACAGTGACGGATCGCCCTCTCTCTTGGTCAGGCGACCGTTTGCCATAATGCCCATGCTTTCCAGCCAGCTCTCAAACTCCGGTGACTCGCGCAGGTTGAGT", |
|
||||||
"CTC", 9}, |
|
||||||
} |
|
||||||
for _, test := range tests { |
|
||||||
result := PatternCount(test.input, test.pattern) |
|
||||||
if result != test.gold { |
|
||||||
err := fmt.Sprintf("Error testing PatternCount(): input = %s, pattern = %s, result = %d (should be %d)", |
|
||||||
test.input, test.pattern, result, test.gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Load a PatternCount test (input and output)
|
|
||||||
// from a file. Run the test with the input
|
|
||||||
// and verify the output matches the output
|
|
||||||
// contained in the file.
|
|
||||||
func TestPatternCountFile(t *testing.T) { |
|
||||||
|
|
||||||
filename := "data/pattern_count.txt" |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a single string
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// lines[0]: Input
|
|
||||||
input := lines[1] |
|
||||||
pattern := lines[2] |
|
||||||
|
|
||||||
// lines[3]: Output
|
|
||||||
output_str := lines[4] |
|
||||||
|
|
||||||
// Convert output to integer
|
|
||||||
output,err := strconv.Atoi(output_str) |
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Call the function with the given inputs
|
|
||||||
result := PatternCount(input, pattern) |
|
||||||
|
|
||||||
// Verify answer
|
|
||||||
if result != output { |
|
||||||
err := fmt.Sprintf("Error testing PatternCount using test case from file: results do not match:\ncomputed result = %d\nexpected output = %d",result,output) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
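
The tests above pin down the behavior expected of `PatternCount`, notably that overlapping matches count (`"GCG"` occurs twice in `"GCGCG"`). The implementation itself is in `rosalind.go`, which is not part of this diff, so the following is only a minimal sketch consistent with those test cases. Returning 0 for an empty pattern is an assumption.

```go
package main

import "fmt"

// PatternCount counts how often pattern occurs in input,
// including overlapping occurrences.
// Assumption: an empty pattern yields 0.
func PatternCount(input, pattern string) int {
	if len(pattern) == 0 || len(pattern) > len(input) {
		return 0
	}
	count := 0
	// Slide a window of len(pattern) across input one position at a time.
	for i := 0; i+len(pattern) <= len(input); i++ {
		if input[i:i+len(pattern)] == pattern {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println(PatternCount("GCGCG", "GCG"))
}
```

The standard library's `strings.Count` is not a drop-in replacement here, because it counts non-overlapping instances only.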
|
||||||
|
|
@ -1,58 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"log" |
|
||||||
"strings" |
|
||||||
"strconv" |
|
||||||
) |
|
||||||
|
|
||||||
// Rosalind: Problem BA1B: Most Frequent k-mers
|
|
||||||
|
|
||||||
// Describe the problem
|
|
||||||
func BA1BDescription() { |
|
||||||
description := []string{ |
|
||||||
"-----------------------------------------", |
|
||||||
"Rosalind: Problem BA1B:", |
|
||||||
"Most Frequent k-mers", |
|
||||||
"", |
|
||||||
"Given an input string and a length k,", |
|
||||||
"report the k-mer or k-mers that occur", |
|
||||||
"most frequently.", |
|
||||||
"", |
|
||||||
"URL: http://rosalind.info/problems/ba1b/", |
|
||||||
"", |
|
||||||
} |
|
||||||
for _, line := range description { |
|
||||||
fmt.Println(line) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Describe the problem, and call the function
|
|
||||||
func BA1B(filename string) { |
|
||||||
|
|
||||||
BA1BDescription() |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a single string
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("Error: readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// Input file contents
|
|
||||||
input := lines[0] |
|
||||||
k_str := lines[1] |
|
||||||
|
|
||||||
k,err := strconv.Atoi(k_str) |
|
||||||
if err!=nil { |
|
||||||
log.Fatalf("Error: string to int conversion: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
mfks,_ := MostFrequentKmers(input,k) |
|
||||||
|
|
||||||
fmt.Println("") |
|
||||||
fmt.Printf("Computed result from input file: %s\n",filename) |
|
||||||
fmt.Println(strings.Join(mfks," ")) |
|
||||||
} |
|
||||||
|
|
@ -1,82 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"sort" |
|
||||||
"strconv" |
|
||||||
"strings" |
|
||||||
"log" |
|
||||||
"testing" |
|
||||||
) |
|
||||||
|
|
||||||
// Run a test of the MostFrequentKmers function
|
|
||||||
func TestMostFrequentKmers(t *testing.T) { |
|
||||||
// Call MostFrequentKmers
|
|
||||||
input := "AAAATGCGCTAGTAAAAGTCACTGAAAA" |
|
||||||
k := 4 |
|
||||||
result,err := MostFrequentKmers(input,k) |
|
||||||
gold := []string{"AAAA"} |
|
||||||
|
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
if !EqualStringSlices(result,gold) { |
|
||||||
err := fmt.Sprintf("Error testing MostFrequentKmers(): input = %s, k = %d, result = %s (should be %s)", |
|
||||||
input, k, result, gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Run a test of the MostFrequentKmers function
|
|
||||||
// using inputs/outputs from a file.
|
|
||||||
func TestMostFrequentKmersFile(t *testing.T) { |
|
||||||
|
|
||||||
filename := "data/frequent_words.txt" |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a single string
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// lines[0]: Input
|
|
||||||
dna := lines[1] |
|
||||||
k_str := lines[2] |
|
||||||
// lines[3]: Output
|
|
||||||
gold := strings.Split(lines[4]," ") |
|
||||||
|
|
||||||
// Convert k to integer
|
|
||||||
k,err := strconv.Atoi(k_str) |
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Call the function with the given inputs
|
|
||||||
result, err := MostFrequentKmers(dna,k) |
|
||||||
|
|
||||||
// Check if function threw error
|
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Check that there _was_ a result
|
|
||||||
if len(result)==0 { |
|
||||||
err := fmt.Sprintf("Error testing MostFrequentKmers using test case from file: length of most frequent kmers found was 0: %q", |
|
||||||
result) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Sort before comparing
|
|
||||||
sort.Strings(gold) |
|
||||||
sort.Strings(result) |
|
||||||
|
|
||||||
// These will only be unequal if something went wrong
|
|
||||||
if !EqualStringSlices(gold,result) { |
|
||||||
err := fmt.Sprintf("Error testing MostFrequentKmers using test case from file: most frequent kmers mismatch.\ncomputed = %q\ngold = %q\n", |
|
||||||
result,gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
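
For readers without `rosalind.go` at hand, here is one way a function with the signature used in these tests could work: count every k-mer in a map, track the highest count, then collect the k-mers that reach it. This is a sketch under assumptions, not the repo's implementation. The error for `k < 1` and the sorted output are choices made here for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// MostFrequentKmers returns the k-mers that occur most often in input.
// Assumptions: k < 1 is an error, and the result is sorted alphabetically.
func MostFrequentKmers(input string, k int) ([]string, error) {
	if k < 1 {
		return nil, errors.New("k must be at least 1")
	}
	counts := make(map[string]int)
	best := 0
	for i := 0; i+k <= len(input); i++ {
		kmer := input[i : i+k]
		counts[kmer]++
		if counts[kmer] > best {
			best = counts[kmer]
		}
	}
	var result []string
	for kmer, n := range counts {
		if n == best {
			result = append(result, kmer)
		}
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Strings(result)
	return result, nil
}

func main() {
	result, err := MostFrequentKmers("AAAATGCGCTAGTAAAAGTCACTGAAAA", 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result)
}
```

Sorting inside the function would make the `sort.Strings` calls in the file-based test redundant, but leaving them in the test keeps it independent of this design choice.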
|
||||||
|
|
@ -1,50 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"log" |
|
||||||
) |
|
||||||
|
|
||||||
// Rosalind: Problem BA1C: Find the Reverse Complement of a String
|
|
||||||
|
|
||||||
// Describe the problem
|
|
||||||
func BA1CDescription() { |
|
||||||
description := []string{ |
|
||||||
"-----------------------------------------", |
|
||||||
"Rosalind: Problem BA1C:", |
|
||||||
"Find the Reverse Complement of a String", |
|
||||||
"", |
|
||||||
"Given a DNA input string,", |
|
||||||
"find the reverse complement", |
|
||||||
"of the DNA string.", |
|
||||||
"", |
|
||||||
"URL: http://rosalind.info/problems/ba1c/", |
|
||||||
"", |
|
||||||
} |
|
||||||
for _, line := range description { |
|
||||||
fmt.Println(line) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Describe the problem, and call the function
|
|
||||||
func BA1C(filename string) { |
|
||||||
|
|
||||||
BA1CDescription() |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a single string
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("Error: readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// Input file contents
|
|
||||||
input := lines[0] |
|
||||||
|
|
||||||
result,_ := ReverseComplement(input) |
|
||||||
|
|
||||||
fmt.Println("") |
|
||||||
fmt.Printf("Computed result from input file: %s\n",filename) |
|
||||||
fmt.Println(result) |
|
||||||
} |
|
||||||
|
|
@ -1,123 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"testing" |
|
||||||
) |
|
||||||
|
|
||||||
// Check that the DNA2Bitmasks utility
|
|
||||||
// extracts the correct bitmasks from
|
|
||||||
// a DNA input string.
|
|
||||||
func TestDNA2Bitmasks(t *testing.T) { |
|
||||||
|
|
||||||
input := "AATCCGCT" |
|
||||||
|
|
||||||
result, func_err := DNA2Bitmasks(input) |
|
||||||
|
|
||||||
// Handle errors from the DNA2Bitmasks function
|
|
||||||
if func_err != nil { |
|
||||||
err := fmt.Sprintf("Error in function DNA2Bitmasks(): input = %s", input) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Assemble gold standard answer (bitvectors)
|
|
||||||
tt := true |
|
||||||
ff := false |
|
||||||
gold := make(map[string][]bool) |
|
||||||
gold["A"] = []bool{tt,tt,ff,ff,ff,ff,ff,ff} |
|
||||||
gold["T"] = []bool{ff,ff,tt,ff,ff,ff,ff,tt} |
|
||||||
gold["C"] = []bool{ff,ff,ff,tt,tt,ff,tt,ff} |
|
||||||
gold["G"] = []bool{ff,ff,ff,ff,ff,tt,ff,ff} |
|
||||||
|
|
||||||
// Verify result from DNA2Bitmasks is same as
|
|
||||||
// our gold standard
|
|
||||||
for _,cod := range "ATCG" { |
|
||||||
cods := string(cod) |
|
||||||
if !EqualBoolSlices(result[cods],gold[cods]) { |
|
||||||
err := fmt.Sprintf("Error testing DNA2Bitmasks(): input = %s, codon = %s, extracted = %v, gold = %v", |
|
||||||
input, cods, result[cods], gold[cods]) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Check that the Bitmasks2DNA utility
|
|
||||||
// constructs the correct DNA string
|
|
||||||
// from bitmasks.
|
|
||||||
func TestBitmasks2DNA(t *testing.T) { |
|
||||||
// Assemble input bitmasks
|
|
||||||
tt := true |
|
||||||
ff := false |
|
||||||
input := make(map[string][]bool) |
|
||||||
input["A"] = []bool{tt,tt,ff,ff,ff,ff,ff,ff} |
|
||||||
input["T"] = []bool{ff,ff,tt,ff,ff,ff,ff,tt} |
|
||||||
input["C"] = []bool{ff,ff,ff,tt,tt,ff,tt,ff} |
|
||||||
input["G"] = []bool{ff,ff,ff,ff,ff,tt,ff,ff} |
|
||||||
|
|
||||||
gold := "AATCCGCT" |
|
||||||
|
|
||||||
result, func_err := Bitmasks2DNA(input) |
|
||||||
|
|
||||||
// Handle errors from the Bitmasks2DNA function
|
|
||||||
if func_err != nil { |
|
||||||
err := fmt.Sprintf("Error in function Bitmasks2DNA(): function returned error") |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Verify result from Bitmasks2DNA is same as
|
|
||||||
// our gold standard
|
|
||||||
if result != gold { |
|
||||||
err := fmt.Sprintf("Error testing Bitmasks2DNA(): result = %s, gold = %s", result, gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Run a test of the function that computes
|
|
||||||
// the ReverseComplement of a DNA string.
|
|
||||||
func TestReverseComplement(t *testing.T) { |
|
||||||
input := "AAAACCCGGT" |
|
||||||
result,_ := ReverseComplement(input) |
|
||||||
gold := "ACCGGGTTTT" |
|
||||||
if result!=gold { |
|
||||||
err := fmt.Sprintf("Error testing ReverseComplement(): input = %s, result = %s (should be %s)", |
|
||||||
input, result, gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Run a test of the ReverseComplement function
|
|
||||||
// using inputs/outputs from a file.
|
|
||||||
func TestReverseComplementFile(t *testing.T) { |
|
||||||
|
|
||||||
filename := "data/reverse_complement.txt" |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a single string
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
// lines[0]: Input
|
|
||||||
input := lines[1] |
|
||||||
// lines[2]: Output
|
|
||||||
gold := lines[3] |
|
||||||
|
|
||||||
// Call the function with the given inputs
|
|
||||||
result, err := ReverseComplement(input) |
|
||||||
|
|
||||||
// Check that there _was_ a result
|
|
||||||
if len(result)==0 { |
|
||||||
err := fmt.Sprintf("Error testing ReverseComplement using test case from file") |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
if result!=gold { |
|
||||||
err := fmt.Sprintf("Error testing ReverseComplement(): input = %s, result = %s (should be %s)", |
|
||||||
input, result, gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
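
The tests above suggest that the repo builds `ReverseComplement` on top of per-nucleotide bitmasks (`DNA2Bitmasks` and `Bitmasks2DNA`). That code is not in this diff, so the sketch below deliberately swaps in a simpler technique, a direct byte-by-byte mapping, to show the expected input/output behavior. It keeps the `(string, error)` signature from the tests. Rejecting characters outside `ACGT` is an assumption.

```go
package main

import "fmt"

// ReverseComplement returns the reverse complement of a DNA string.
// This uses a plain byte mapping, not the bitmask approach in the repo.
// Assumption: any character other than A, C, G, T is an error.
func ReverseComplement(input string) (string, error) {
	out := make([]byte, len(input))
	for i := 0; i < len(input); i++ {
		var c byte
		switch input[i] {
		case 'A':
			c = 'T'
		case 'T':
			c = 'A'
		case 'C':
			c = 'G'
		case 'G':
			c = 'C'
		default:
			return "", fmt.Errorf("invalid nucleotide %q at position %d", input[i], i)
		}
		// Write the complement at the mirrored position to reverse the string.
		out[len(input)-1-i] = c
	}
	return string(out), nil
}

func main() {
	result, err := ReverseComplement("AAAACCCGGT")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result)
}
```

Complementing and reversing in a single pass avoids building an intermediate string.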
|
||||||
|
|
@ -1,61 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"strconv" |
|
||||||
"strings" |
|
||||||
"log" |
|
||||||
) |
|
||||||
|
|
||||||
// Rosalind: Problem BA1D: Find all occurrences of pattern in string
|
|
||||||
|
|
||||||
// Describe the problem
|
|
||||||
func BA1DDescription() { |
|
||||||
description := []string{ |
|
||||||
"-----------------------------------------", |
|
||||||
"Rosalind: Problem BA1D:", |
|
||||||
"Find all occurrences of pattern in string", |
|
||||||
"", |
|
||||||
"Given a string input (genome) and a substring (pattern),", |
|
||||||
"return all starting positions in the genome where the", |
|
||||||
"pattern occurs in the genome.", |
|
||||||
"", |
|
||||||
"URL: http://rosalind.info/problems/ba1d/", |
|
||||||
"", |
|
||||||
} |
|
||||||
for _, line := range description { |
|
||||||
fmt.Println(line) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Describe the problem, and call the function
|
|
||||||
func BA1D(filename string) { |
|
||||||
|
|
||||||
BA1DDescription() |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a single string
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("Error: readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// Input file contents
|
|
||||||
pattern := lines[0] |
|
||||||
genome := lines[1] |
|
||||||
|
|
||||||
// Result is a slice of ints
|
|
||||||
locs,_ := FindOccurrences(pattern,genome) |
|
||||||
|
|
||||||
// Convert to a slice of strings for easier printing
|
|
||||||
locs_str := make([]string,len(locs)) |
|
||||||
for i,j := range locs { |
|
||||||
locs_str[i] = strconv.Itoa(j) |
|
||||||
} |
|
||||||
|
|
||||||
fmt.Println("") |
|
||||||
fmt.Printf("Computed result from input file: %s\n",filename) |
|
||||||
fmt.Println(strings.Join(locs_str," ")) |
|
||||||
} |
|
||||||
|
|
@ -1,97 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"log" |
|
||||||
"strings" |
|
||||||
"strconv" |
|
||||||
"testing" |
|
||||||
) |
|
||||||
|
|
||||||
func TestFindOccurrences(t *testing.T) { |
|
||||||
// Call FindOccurrences
|
|
||||||
pattern := "ATAT" |
|
||||||
genome := "GATATATGCATATACTT" |
|
||||||
|
|
||||||
result,err := FindOccurrences(pattern,genome) |
|
||||||
gold := []int{1,3,9} |
|
||||||
|
|
||||||
if !EqualIntSlices(result,gold) || err!=nil { |
|
||||||
err := fmt.Sprintf("Error testing FindOccurrences(): result = %v, should be %v", |
|
||||||
result, gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
func TestFindOccurrencesDebug(t *testing.T) { |
|
||||||
// Construct a test matrix
|
|
||||||
var tests = []struct { |
|
||||||
pattern string |
|
||||||
genome string |
|
||||||
gold []int |
|
||||||
}{ |
|
||||||
{"ACAC", "TTTTACACTTTTTTGTGTAAAAA", |
|
||||||
[]int{4}}, |
|
||||||
{"AAA", "AAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAATAATTACAGAGTACACAACATCCAT", |
|
||||||
[]int{0,46,51,74}}, |
|
||||||
{"TTT", "AGCGTGCCGAAATATGCCGCCAGACCTGCTGCGGTGGCCTCGCCGACTTCACGGATGCCAAGTGCATAGAGGAAGCGAGCAAAGGTGGTTTCTTTCGCTTTATCCAGCGCGTTAACCACGTTCTGTGCCGACTTT", |
|
||||||
[]int{88,92,98,132}}, |
|
||||||
{"ATA", "ATATATA", |
|
||||||
[]int{0,2,4}}, |
|
||||||
} |
|
||||||
for _, test := range tests { |
|
||||||
|
|
||||||
result,err := FindOccurrences(test.pattern, test.genome) |
|
||||||
|
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
if !EqualIntSlices(result,test.gold) { |
|
||||||
err := fmt.Sprintf("Error testing FindOccurrences(): result = %v, should be %v", |
|
||||||
result, test.gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
func TestFindOccurrencesFiles(t *testing.T) { |
|
||||||
|
|
||||||
filename := "data/pattern_matching.txt" |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a single string
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("Error: readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// lines[0]: Input
|
|
||||||
pattern := lines[1] |
|
||||||
genome := lines[2] |
|
||||||
|
|
||||||
// lines[3]: Output
|
|
||||||
gold_str := lines[4] |
|
||||||
gold_slice := strings.Split(gold_str," ") |
|
||||||
|
|
||||||
gold := make([]int,len(gold_slice)) |
|
||||||
for i,g := range gold_slice { |
|
||||||
gold[i],err = strconv.Atoi(g) |
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
result,err := FindOccurrences(pattern,genome) |
|
||||||
|
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
if !EqualIntSlices(result,gold) { |
|
||||||
err := fmt.Sprintf("Error testing FindOccurrences():\nresult = %v\ngold = %v\n", |
|
||||||
result, gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
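
The test cases above expect zero-based start positions and overlapping matches (`"ATA"` in `"ATATATA"` gives `0 2 4`). A minimal sketch that satisfies them is shown below. It is not taken from the repo, and treating an empty pattern as an error is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// FindOccurrences returns the zero-based start index of every
// (possibly overlapping) occurrence of pattern in genome.
// Assumption: an empty pattern is an error.
func FindOccurrences(pattern, genome string) ([]int, error) {
	if len(pattern) == 0 {
		return nil, errors.New("pattern must not be empty")
	}
	locs := []int{}
	for i := 0; i+len(pattern) <= len(genome); i++ {
		if genome[i:i+len(pattern)] == pattern {
			locs = append(locs, i)
		}
	}
	return locs, nil
}

func main() {
	locs, err := FindOccurrences("ATAT", "GATATATGCATATACTT")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(locs)
}
```

Advancing by one position after each match, rather than by the pattern length, is what lets overlapping occurrences be reported.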
|
||||||
|
|
@ -1,58 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"log" |
|
||||||
"strings" |
|
||||||
"strconv" |
|
||||||
) |
|
||||||
|
|
||||||
// Rosalind: Problem BA1E: Find patterns forming clumps in a string
|
|
||||||
|
|
||||||
// Describe the problem
|
|
||||||
func BA1EDescription() { |
|
||||||
description := []string{ |
|
||||||
"-----------------------------------------", |
|
||||||
"Rosalind: Problem BA1E:", |
|
||||||
"Find patterns forming clumps in a string", |
|
||||||
"", |
|
||||||
"A clump is characterized by integers L and t", |
|
||||||
"if there is an interval in the genome of length L", |
|
||||||
"in which a given pattern occurs t or more times.", |
|
||||||
"", |
|
||||||
"URL: http://rosalind.info/problems/ba1e/", |
|
||||||
"", |
|
||||||
} |
|
||||||
for _, line := range description { |
|
||||||
fmt.Println(line) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Describe the problem, and call the function
|
|
||||||
func BA1E(filename string) { |
|
||||||
|
|
||||||
BA1EDescription() |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a single string
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("Error: readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// Input file contents
|
|
||||||
genome := lines[0] |
|
||||||
params_str := lines[1] |
|
||||||
params_slice := strings.Split(params_str," ") |
|
||||||
|
|
||||||
k,_ := strconv.Atoi(params_slice[0]) |
|
||||||
L,_ := strconv.Atoi(params_slice[1]) |
|
||||||
t,_ := strconv.Atoi(params_slice[2]) |
|
||||||
|
|
||||||
patterns,_ := FindClumps(genome,k,L,t) |
|
||||||
|
|
||||||
fmt.Println("") |
|
||||||
fmt.Printf("Computed result from input file: %s\n",filename) |
|
||||||
fmt.Println(strings.Join(patterns," ")) |
|
||||||
} |
|
||||||
|
|
@ -1,42 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"testing" |
|
||||||
) |
|
||||||
|
|
||||||
func TestMatrixFindClumps(t *testing.T) { |
|
||||||
var tests = []struct { |
|
||||||
genome string |
|
||||||
k int |
|
||||||
L int |
|
||||||
t int |
|
||||||
gold []string |
|
||||||
}{ |
|
||||||
{"CGGACTCGACAGATGTGAAGAACGACAATGTGAAGACTCGACACGACAGAGTGAAGAGAAGAGGAAACATTGTAA", |
|
||||||
5, 50, 4, |
|
||||||
[]string{"CGACA","GAAGA"}}, |
|
||||||
{"AAAACGTCGAAAAA", |
|
||||||
2, 4, 2, |
|
||||||
[]string{"AA"}}, |
|
||||||
{"ACGTACGT", |
|
||||||
1, 5, 2, |
|
||||||
[]string{"A","C","G","T"}}, |
|
||||||
{"CCACGCGGTGTACGCTGCAAAAAGCCTTGCTGAATCAAATAAGGTTCCAGCACATCCTCAATGGTTTCACGTTCTTCGCCAATGGCTGCCGCCAGGTTATCCAGACCTACAGGTCCACCAAAGAACTTATCGATTACCGCCAGCAACAATTTGCGGTCCATATAATCGAAACCTTCAGCATCGACATTCAACATATCCAGCG", |
|
||||||
3, 25, 3, |
|
||||||
[]string{"AAA","CAG","CAT","CCA","GCC","TTC"}}, |
|
||||||
|
|
||||||
} |
|
||||||
for _, test := range tests { |
|
||||||
result,err := FindClumps(test.genome, |
|
||||||
test.k, test.L, test.t) |
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
if !EqualStringSlices(result,test.gold) { |
|
||||||
err := fmt.Sprintf("Error testing FindClumps(): k = %d, L = %d, t = %d",test.k,test.L,test.t) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
@ -1,60 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"strings" |
|
||||||
"strconv" |
|
||||||
"log" |
|
||||||
) |
|
||||||
|
|
||||||
// Rosalind: Problem BA1F: Find positions in a genome that minimize skew
|
|
||||||
|
|
||||||
// Describe the problem
|
|
||||||
func BA1FDescription() { |
|
||||||
description := []string{ |
|
||||||
"-----------------------------------------", |
|
||||||
"Rosalind: Problem BA1F:", |
|
||||||
"Find positions in a gene that minimize skew", |
|
||||||
"", |
|
||||||
"The skew of a genome is defined as the difference", |
|
||||||
"between the number of G nucleotides and the number of", |
|
||||||
"C nucleotides. Given a DNA string, this function should", |
|
||||||
"compute the cumulative skew for each position in", |
|
||||||
"the genome, and report the indices where the skew", |
|
||||||
"value is minimized.", |
|
||||||
"", |
|
||||||
"URL: http://rosalind.info/problems/ba1f/", |
|
||||||
"", |
|
||||||
} |
|
||||||
for _, line := range description { |
|
||||||
fmt.Println(line) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Describe the problem, and call the function
|
|
||||||
func BA1F(filename string) { |
|
||||||
|
|
||||||
BA1FDescription() |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a slice of lines
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("Error: readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// Input file contents
|
|
||||||
genome := lines[0] |
|
||||||
|
|
||||||
minskew,_ := MinSkewPositions(genome) |
|
||||||
|
|
||||||
minskew_str := make([]string,len(minskew)) |
|
||||||
for i,j := range minskew { |
|
||||||
minskew_str[i] = strconv.Itoa(j) |
|
||||||
} |
|
||||||
|
|
||||||
fmt.Println("") |
|
||||||
fmt.Printf("Computed result from input file: %s\n",filename) |
|
||||||
fmt.Println(strings.Join(minskew_str," ")) |
|
||||||
} |
|
||||||
|
|
@ -1,53 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"sort" |
|
||||||
"testing" |
|
||||||
) |
|
||||||
|
|
||||||
func TestMatrixMinSkewPosition(t *testing.T) { |
|
||||||
var tests = []struct { |
|
||||||
genome string |
|
||||||
gold []int |
|
||||||
}{ |
|
||||||
{"CCTATCGGTGGATTAGCATGTCCCTGTACGTTTCGCCGCGAACTAGTTCACACGGCTTGATGGCAAATGGTTTTTCCGGCGACCGTAATCGTCCACCGAG", |
|
||||||
[]int{53, 97}}, |
|
||||||
{"TAAAGACTGCCGAGAGGCCAACACGAGTGCTAGAACGAGGGGCGTAAACGCGGGTCCGA", |
|
||||||
[]int{11, 24}}, |
|
||||||
{"ACCG", |
|
||||||
[]int{3}}, |
|
||||||
{"ACCC", |
|
||||||
[]int{4}}, |
|
||||||
{"CCGGGT", |
|
||||||
[]int{2}}, |
|
||||||
{"CCGGCCGG", |
|
||||||
[]int{2,6}}, |
|
||||||
} |
|
||||||
for _, test := range tests { |
|
||||||
|
|
||||||
// Do it - find the positions that minimize skew
|
|
||||||
result,err := MinSkewPositions(test.genome) |
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Check length of result
|
|
||||||
if len(result)!=len(test.gold) { |
|
||||||
err := fmt.Sprintf("Error testing MinSkewPositions():\nfor genome: %s\nlength of result (%d) did not match length of gold standard (%d).\nFound: %v\nShould be: %v", |
|
||||||
test.genome, len(result), len(test.gold), |
|
||||||
result, test.gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Sort before comparing
|
|
||||||
sort.Ints(result) |
|
||||||
sort.Ints(test.gold) |
|
||||||
if !EqualIntSlices(result,test.gold) { |
|
||||||
err := fmt.Sprintf("Error testing MinSkewPositions():\nfor genome: %s\nfound: %v\nshould be: %v", |
|
||||||
test.genome, result, test.gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
@ -1,52 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"log" |
|
||||||
) |
|
||||||
|
|
||||||
// Rosalind: Problem BA1G: Find Hamming distance between two DNA strings
|
|
||||||
|
|
||||||
// Describe the problem
|
|
||||||
func BA1GDescription() { |
|
||||||
description := []string{ |
|
||||||
"-----------------------------------------", |
|
||||||
"Rosalind: Problem BA1G:", |
|
||||||
"Find Hamming distance between two DNA strings", |
|
||||||
"", |
|
||||||
"The Hamming distance between two strings HammingDistance(p,q)", |
|
||||||
"is the number of characters different between the two", |
|
||||||
"strands. This program computes the Hamming distance", |
|
||||||
"between two strings.", |
|
||||||
"", |
|
||||||
"URL: http://rosalind.info/problems/ba1g/", |
|
||||||
"", |
|
||||||
} |
|
||||||
for _, line := range description { |
|
||||||
fmt.Println(line) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Describe the problem, and call the function
|
|
||||||
func BA1G(filename string) { |
|
||||||
|
|
||||||
BA1GDescription() |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a slice of lines
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("Error: readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// Input file contents
|
|
||||||
p := lines[0] |
|
||||||
q := lines[1] |
|
||||||
|
|
||||||
hamm,_ := HammingDistance(p,q) |
|
||||||
|
|
||||||
fmt.Println("") |
|
||||||
fmt.Printf("Computed result from input file: %s\n",filename) |
|
||||||
fmt.Println(hamm) |
|
||||||
} |
|
||||||
|
|
@ -1,49 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"testing" |
|
||||||
) |
|
||||||
|
|
||||||
func TestMatrixHammingDistance(t *testing.T) { |
|
||||||
var tests = []struct { |
|
||||||
p string |
|
||||||
q string |
|
||||||
dist int |
|
||||||
}{ |
|
||||||
{"GGGCCGTTGGT", |
|
||||||
"GGACCGTTGAC", |
|
||||||
3 }, |
|
||||||
{"AAAA", |
|
||||||
"TTTT", |
|
||||||
4 }, |
|
||||||
{"ACGTACGT", |
|
||||||
"TACGTACG", |
|
||||||
8 }, |
|
||||||
{"ACGTACGT", |
|
||||||
"CCCCCCCC", |
|
||||||
6 }, |
|
||||||
{"ACGTACGT", |
|
||||||
"TGCATGCA", |
|
||||||
8 }, |
|
||||||
{"GATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATAC", |
|
||||||
"AATAGCAGCTTCTCAACTGGTTACCTCGTATGAGTAAATTAGGTCATTATTGACTCAGGTCACTAACGTC", |
|
||||||
15 }, |
|
||||||
{"AGAAACAGACCGCTATGTTCAACGATTTGTTTTATCTCGTCACCGGGATATTGCGGCCACTCATCGGTCAGTTGATTACGCAGGGCGTAAATCGCCAGAATCAGGCTG", |
|
||||||
"AGAAACCCACCGCTAAAAACAACGATTTGCGTAGTCAGGTCACCGGGATATTGCGGCCACTAAGGCCTTGGATGATTACGCAGAACGTATTGACCCAGAATCAGGCTC", |
|
||||||
28 }, |
|
||||||
} |
|
||||||
for _, test := range tests { |
|
||||||
result,err := HammingDistance(test.p, test.q) |
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
if result!=test.dist { |
|
||||||
err := fmt.Sprintf("Error testing HammingDistance(): computed dist = %d (should be %d)\np = %s\nq = %s\n", |
|
||||||
result, test.dist, |
|
||||||
test.p, test.q) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
@ -1,65 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"strconv" |
|
||||||
"strings" |
|
||||||
"log" |
|
||||||
) |
|
||||||
|
|
||||||
// Rosalind: Problem BA1H: Find approximate occurrences of pattern in string
|
|
||||||
|
|
||||||
// Describe the problem
|
|
||||||
func BA1HDescription() { |
|
||||||
description := []string{ |
|
||||||
"-----------------------------------------", |
|
||||||
"Rosalind: Problem BA1H:", |
|
||||||
"Find approximate occurrences of pattern in string", |
|
||||||
"", |
|
||||||
"Given a string Text and a string Pattern, and a maximum", |
|
||||||
"Hamming distance d, return all locations in Text where", |
|
||||||
"there is an approximate match with Pattern (i.e., a pattern", |
|
||||||
"with a Hamming distance from Pattern of d or less).", |
|
||||||
"", |
|
||||||
"URL: http://rosalind.info/problems/ba1h/", |
|
||||||
"", |
|
||||||
} |
|
||||||
for _, line := range description { |
|
||||||
fmt.Println(line) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// Describe the problem, and call the function
|
|
||||||
func BA1H(filename string) { |
|
||||||
|
|
||||||
BA1HDescription() |
|
||||||
|
|
||||||
// Read the contents of the input file
|
|
||||||
// into a slice of lines
|
|
||||||
lines, err := readLines(filename) |
|
||||||
if err != nil { |
|
||||||
log.Fatalf("Error: readLines: %v",err) |
|
||||||
} |
|
||||||
|
|
||||||
// Input file contents
|
|
||||||
pattern := lines[0] |
|
||||||
text := lines[1] |
|
||||||
d_str := lines[2] |
|
||||||
|
|
||||||
d,_ := strconv.Atoi(d_str) |
|
||||||
|
|
||||||
approx,_ := FindApproximateOccurrences(pattern,text,d) |
|
||||||
|
|
||||||
approx_str := make([]string,len(approx)) |
|
||||||
for i,j := range approx { |
|
||||||
approx_str[i] = strconv.Itoa(j) |
|
||||||
// strconv.Itoa cannot fail, so no error check is needed here
|
|
||||||
} |
|
||||||
|
|
||||||
fmt.Println("") |
|
||||||
fmt.Printf("Computed result from input file: %s\n",filename) |
|
||||||
fmt.Println(strings.Join(approx_str," ")) |
|
||||||
} |
|
||||||
|
|
@ -1,56 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"testing" |
|
||||||
) |
|
||||||
|
|
||||||
func TestMatrixApproximateOccurrences(t *testing.T) { |
|
||||||
var tests = []struct { |
|
||||||
pattern string |
|
||||||
text string |
|
||||||
d int |
|
||||||
gold []int |
|
||||||
}{ |
|
||||||
{"ATTCTGGA", |
|
||||||
"CGCCCGAATCCAGAACGCATTCCCATATTTCGGGACCACTGGCCTCCACGGTACGGACGTCAATCAAATGCCTAGCGGCTTGTGGTTTCTCCTACGCTCC", |
|
||||||
3, |
|
||||||
[]int{6, 7, 26, 27, 78}}, |
|
||||||
{"AAA", |
|
||||||
"TTTTTTAAATTTTAAATTTTTT", |
|
||||||
2, |
|
||||||
[]int{4, 5, 6, 7, 8, 11, 12, 13, 14, 15}}, |
|
||||||
{"GAGCGCTGG", |
|
||||||
"GAGCGCTGGGTTAACTCGCTACTTCCCGACGAGCGCTGTGGCGCAAATTGGCGATGAAACTGCAGAGAGAACTGGTCATCCAACTGAATTCTCCCCGCTATCGCATTTTGATGCGCGCCGCGTCGATT", |
|
||||||
2, |
|
||||||
[]int{0, 30, 66}}, |
|
||||||
{"AATCCTTTCA", |
|
||||||
"CCAAATCCCCTCATGGCATGCATTCCCGCAGTATTTAATCCTTTCATTCTGCATATAAGTAGTGAAGGTATAGAAACCCGTTCAAGCCCGCAGCGGTAAAACCGAGAACCATGATGAATGCACGGCGATTGCGCCATAATCCAAACA", |
|
||||||
3, |
|
||||||
[]int{3, 36, 74, 137}}, |
|
||||||
{"CCGTCATCC", |
|
||||||
"CCGTCATCCGTCATCCTCGCCACGTTGGCATGCATTCCGTCATCCCGTCAGGCATACTTCTGCATATAAGTACAAACATCCGTCATGTCAAAGGGAGCCCGCAGCGGTAAAACCGAGAACCATGATGAATGCACGGCGATTGC", |
|
||||||
3, |
|
||||||
[]int{0, 7, 36, 44, 48, 72, 79, 112}}, |
|
||||||
{"TTT", |
|
||||||
"AAAAAA", |
|
||||||
3, |
|
||||||
[]int{0, 1, 2, 3}}, |
|
||||||
{"CCA", |
|
||||||
"CCACCT", |
|
||||||
0, |
|
||||||
[]int{0}}, |
|
||||||
} |
|
||||||
for _, test := range tests { |
|
||||||
result,err := FindApproximateOccurrences(test.pattern, test.text, test.d) |
|
||||||
if err!=nil { |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
if !EqualIntSlices(result, test.gold) { |
|
||||||
err := fmt.Sprintf("Error testing FindApproximateOccurrences:\ncomputed = %v\ngold = %v", |
|
||||||
result, test.gold) |
|
||||||
t.Error(err) |
|
||||||
} |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
@ -1,15 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
) |
|
||||||
|
|
||||||
func main() { |
|
||||||
//BA1A("for_real/rosalind_ba1a.txt")
|
|
||||||
//BA1B("for_real/rosalind_ba1b.txt")
|
|
||||||
//BA1C("for_real/rosalind_ba1c.txt")
|
|
||||||
//BA1D("for_real/rosalind_ba1d.txt")
|
|
||||||
//BA1E("for_real/rosalind_ba1e.txt")
|
|
||||||
//BA1F("for_real/rosalind_ba1f.txt")
|
|
||||||
//BA1G("for_real/rosalind_ba1g.txt")
|
|
||||||
BA1H("for_real/rosalind_ba1h.txt") |
|
||||||
} |
|
@ -1,545 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"fmt" |
|
||||||
"sort" |
|
||||||
"errors" |
|
||||||
s "strings" |
|
||||||
) |
|
||||||
|
|
||||||
|
|
||||||
/* |
|
||||||
rosalind.go: |
|
||||||
|
|
||||||
This file contains core functions that |
|
||||||
are used to solve Rosalind problems. |
|
||||||
*/ |
|
||||||
|
|
||||||
|
|
||||||
////////////////////////////////
|
|
||||||
// BA1A
|
|
||||||
|
|
||||||
|
|
||||||
// Count occurrences of a substring pattern
|
|
||||||
// in a string input
|
|
||||||
func PatternCount(input string, pattern string) int { |
|
||||||
|
|
||||||
// Number of substring overlaps
|
|
||||||
var overlap = len(input) - len(pattern) + 1 |
|
||||||
|
|
||||||
// If overlap < 1, we are looking
|
|
||||||
// for a pattern longer than our input
|
|
||||||
if overlap<1 { |
|
||||||
return 0 |
|
||||||
} |
|
||||||
|
|
||||||
// Count of occurrences
|
|
||||||
count:=0 |
|
||||||
|
|
||||||
// Loop over each substring overlap
|
|
||||||
for i:=0; i<overlap; i++ { |
|
||||||
// Grab a slice of the full input
|
|
||||||
start:=i |
|
||||||
end:=i+len(pattern) |
|
||||||
var slice = input[start:end] |
|
||||||
if slice==pattern { |
|
||||||
count += 1 |
|
||||||
} |
|
||||||
} |
|
||||||
return count |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
////////////////////////////////
|
|
||||||
// BA1B
|
|
||||||
|
|
||||||
|
|
||||||
// Return the histogram of kmers of length k
|
|
||||||
// found in the given input
|
|
||||||
func KmerHistogram(input string, k int) (map[string]int,error) { |
|
||||||
|
|
||||||
result := map[string]int{} |
|
||||||
|
|
||||||
if len(input)<1 { |
|
||||||
err := "Error: KmerHistogram received an empty input string" |
|
||||||
return result, errors.New(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Number of substring overlaps
|
|
||||||
overlap := len(input) - k + 1 |
|
||||||
|
|
||||||
// If overlap < 1, we are looking
|
|
||||||
// for kmers longer than our input
|
|
||||||
if overlap<1 { |
|
||||||
return result,nil |
|
||||||
} |
|
||||||
|
|
||||||
// Iterate over each position,
|
|
||||||
// extract the string,
|
|
||||||
// increment the count.
|
|
||||||
for i:=0; i<overlap; i++ { |
|
||||||
// Get the kmer of interest
|
|
||||||
substr := input[i:i+k] |
|
||||||
|
|
||||||
// If it doesn't exist, the value is 0
|
|
||||||
result[substr] += 1 |
|
||||||
} |
|
||||||
|
|
||||||
return result,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Find the most frequent kmer(s) in the kmer histogram,
|
|
||||||
// and return as a string array slice
|
|
||||||
func MostFrequentKmers(input string, k int) ([]string,error) { |
|
||||||
max := 0 |
|
||||||
|
|
||||||
// most frequent kmers
|
|
||||||
mfks := []string{} |
|
||||||
|
|
||||||
if k<1 { |
|
||||||
err := fmt.Sprintf("Error: MostFrequentKmers received a kmer size that was not a natural number: k = %d",k) |
|
||||||
return mfks, errors.New(err) |
|
||||||
} |
|
||||||
|
|
||||||
khist,err := KmerHistogram(input,k) |
|
||||||
|
|
||||||
if err != nil { |
|
||||||
err := fmt.Sprintf("Error: MostFrequentKmers failed when calling KmerHistogram(): %v",err) |
|
||||||
return mfks, errors.New(err) |
|
||||||
} |
|
||||||
|
|
||||||
for kmer,freq := range khist { |
|
||||||
if freq > max { |
|
||||||
// We have a new maximum, and a new set of kmers
|
|
||||||
max = freq |
|
||||||
mfks = []string{kmer} |
|
||||||
} else if freq==max { |
|
||||||
// We have another maximum
|
|
||||||
mfks = append(mfks,kmer) |
|
||||||
} |
|
||||||
} |
|
||||||
return mfks,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Find the kmer(s) in the kmer histogram
|
|
||||||
// with a count of at least N, and return as
|
|
||||||
// a string array slice
|
|
||||||
func MoreFrequentThanNKmers(input string, k, N int) ([]string,error) { |
|
||||||
|
|
||||||
// more frequent than n kmers
|
|
||||||
mftnks := []string{} |
|
||||||
|
|
||||||
if k<1 || N<1 { |
|
||||||
err := fmt.Sprintf("Error: MoreFrequentThanNKmers received a kmer or frequency size that was not a natural number: k = %d, N = %d",k,N) |
|
||||||
return mftnks, errors.New(err) |
|
||||||
} |
|
||||||
|
|
||||||
khist,err := KmerHistogram(input,k) |
|
||||||
|
|
||||||
if err != nil { |
|
||||||
err := fmt.Sprintf("Error: MoreFrequentThanNKmers failed when calling KmerHistogram(): %v",err) |
|
||||||
return mftnks, errors.New(err) |
|
||||||
} |
|
||||||
|
|
||||||
for kmer,freq := range khist { |
|
||||||
if freq >= N { |
|
||||||
// Add another more frequent than n
|
|
||||||
mftnks = append(mftnks,kmer) |
|
||||||
} |
|
||||||
} |
|
||||||
return mftnks,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
////////////////////////////////
|
|
||||||
// BA1C
|
|
||||||
|
|
||||||
|
|
||||||
// ReverseString returns its argument string reversed
|
|
||||||
// rune-wise left to right.
|
|
||||||
// https://github.com/golang/example/blob/master/stringutil/reverse.go
|
|
||||||
func ReverseString(s string) string { |
|
||||||
r := []rune(s) |
|
||||||
for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 { |
|
||||||
r[i], r[j] = r[j], r[i] |
|
||||||
} |
|
||||||
return string(r) |
|
||||||
} |
|
||||||
|
|
||||||
// Given an alleged DNA input string,
|
|
||||||
// iterate through it character by character
|
|
||||||
// to ensure that it only contains ATGC.
|
|
||||||
// Returns true if this is DNA (ATGC only),
|
|
||||||
// false otherwise.
|
|
||||||
func CheckIsDNA(input string) bool { |
|
||||||
|
|
||||||
// Convert input to uppercase
|
|
||||||
input = s.ToUpper(input) |
|
||||||
|
|
||||||
// If any character is not ATCG, fail
|
|
||||||
for _, c := range input { |
|
||||||
if c!='A' && c!='T' && c!='C' && c!='G' { |
|
||||||
return false |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
// If we made it here, everything's gravy!
|
|
||||||
return true |
|
||||||
} |
|
||||||
|
|
||||||
// Convert a DNA string into four bitmasks:
|
|
||||||
// one each for ATGC. That is, for the DNA
|
|
||||||
// string AATCCGCT, it would become:
|
|
||||||
//
|
|
||||||
// bitmask[A] = 11000000
|
|
||||||
// bitmask[T] = 00100001
|
|
||||||
// bitmask[C] = 00011010
|
|
||||||
// bitmask[G] = 00000100
|
|
||||||
func DNA2Bitmasks(input string) (map[string][]bool,error) { |
|
||||||
|
|
||||||
// Convert input to uppercase
|
|
||||||
input = s.ToUpper(input) |
|
||||||
|
|
||||||
// Allocate space for the map
|
|
||||||
m := make(map[string][]bool) |
|
||||||
|
|
||||||
// Start by checking whether we have DNA
|
|
||||||
if !CheckIsDNA(input) { |
|
||||||
err := fmt.Sprintf("Error: input string was not DNA. Only characters ATCG are allowed, you had %s",input) |
|
||||||
return m, errors.New(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Important: we want to iterate over the
|
|
||||||
// DNA string ONCE and only once. That means
|
|
||||||
// we need to have the bit vectors initialized
|
|
||||||
// already, and as we step through the DNA
|
|
||||||
// string, we access the appropriate index
|
|
||||||
// of the appropriate bit vector and set
|
|
||||||
// it to true.
|
|
||||||
m["A"] = make([]bool, len(input)) |
|
||||||
m["T"] = make([]bool, len(input)) |
|
||||||
m["C"] = make([]bool, len(input)) |
|
||||||
m["G"] = make([]bool, len(input)) |
|
||||||
|
|
||||||
// To begin with, every bit vector is false.
|
|
||||||
for i,c := range input { |
|
||||||
cs := string(c) |
|
||||||
// Get the corresponding bit vector - O(1)
|
|
||||||
bitty := m[cs] |
|
||||||
// Flip to true for this position - O(1)
|
|
||||||
bitty[i] = true |
|
||||||
} |
|
||||||
|
|
||||||
return m,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Convert four bitmasks (one each for ATGC)
|
|
||||||
// into a DNA string.
|
|
||||||
func Bitmasks2DNA(bitmasks map[string][]bool) (string,error) { |
|
||||||
|
|
||||||
// Verify ATGC keys are all present
|
|
||||||
_,Aok := bitmasks["A"] |
|
||||||
_,Tok := bitmasks["T"] |
|
||||||
_,Gok := bitmasks["G"] |
|
||||||
_,Cok := bitmasks["C"] |
|
||||||
if !(Aok && Tok && Gok && Cok) { |
|
||||||
err := fmt.Sprintf("Error: input bitmask was missing one of: ATGC (Keys present? A: %t, T: %t, G: %t, C: %t)",Aok,Tok,Gok,Cok) |
|
||||||
return "", errors.New(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Hope that all bitmasks are the same size
|
|
||||||
size := len(bitmasks["A"]) |
|
||||||
|
|
||||||
// Make a rune array that we'll turn into
|
|
||||||
// a string for our final return value
|
|
||||||
dna := make([]rune,size) |
|
||||||
|
|
||||||
// Iterate over the bitmask, using only
|
|
||||||
// the index and not the mask value itself
|
|
||||||
for i := range bitmasks["A"] { |
|
||||||
if bitmasks["A"][i] == true { |
|
||||||
dna[i] = 'A' |
|
||||||
} else if bitmasks["T"][i] == true { |
|
||||||
dna[i] = 'T' |
|
||||||
} else if bitmasks["G"][i] == true { |
|
||||||
dna[i] = 'G' |
|
||||||
} else if bitmasks["C"][i] == true { |
|
||||||
dna[i] = 'C' |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
return string(dna),nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Given a DNA input string, find the
|
|
||||||
// complement. The complement swaps
|
|
||||||
// Gs and Cs, and As and Ts.
|
|
||||||
func Complement(input string) (string,error) { |
|
||||||
|
|
||||||
// Convert input to uppercase
|
|
||||||
input = s.ToUpper(input) |
|
||||||
|
|
||||||
// Start by checking whether we have DNA
|
|
||||||
if !CheckIsDNA(input) { |
|
||||||
return "", fmt.Errorf("Error: input string was not DNA. Only characters ATCG are allowed, you had %s",input) |
|
||||||
} |
|
||||||
|
|
||||||
m,_ := DNA2Bitmasks(input) |
|
||||||
|
|
||||||
// Swap As and Ts
|
|
||||||
newT := m["A"] |
|
||||||
newA := m["T"] |
|
||||||
m["T"] = newT |
|
||||||
m["A"] = newA |
|
||||||
|
|
||||||
// Swap Cs and Gs
|
|
||||||
newG := m["C"] |
|
||||||
newC := m["G"] |
|
||||||
m["G"] = newG |
|
||||||
m["C"] = newC |
|
||||||
|
|
||||||
output,_ := Bitmasks2DNA(m) |
|
||||||
|
|
||||||
return output,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Given a DNA input string, find the
|
|
||||||
// reverse complement. The complement
|
|
||||||
// swaps Gs and Cs, and As and Ts.
|
|
||||||
// The reverse complement reverses that.
|
|
||||||
func ReverseComplement(input string) (string,error) { |
|
||||||
|
|
||||||
// Convert input to uppercase
|
|
||||||
input = s.ToUpper(input) |
|
||||||
|
|
||||||
// Start by checking whether we have DNA
|
|
||||||
if !CheckIsDNA(input) { |
|
||||||
err := fmt.Sprintf("Error: input string was not DNA. Only characters ATCG are allowed, you had %s",input) |
|
||||||
return "", errors.New(err) |
|
||||||
} |
|
||||||
|
|
||||||
comp,_ := Complement(input) |
|
||||||
|
|
||||||
revcomp := ReverseString(comp) |
|
||||||
|
|
||||||
return revcomp,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
////////////////////////////////
|
|
||||||
// BA1D
|
|
||||||
|
|
||||||
|
|
||||||
// Given a large string (genome) and a string (pattern),
|
|
||||||
// find the zero-based indices where pattern occurs in genome.
|
|
||||||
func FindOccurrences(pattern, genome string) ([]int,error) { |
|
||||||
locations := []int{} |
|
||||||
slots := len(genome)-len(pattern)+1 |
|
||||||
|
|
||||||
if slots<1 { |
|
||||||
// pattern is longer than genome
|
|
||||||
return locations,nil |
|
||||||
} |
|
||||||
|
|
||||||
// Loop over each character,
|
|
||||||
// saving the position if it
|
|
||||||
// is the start of pattern
|
|
||||||
for i:=0; i<slots; i++ { |
|
||||||
start := i |
|
||||||
end := i+len(pattern) |
|
||||||
if genome[start:end]==pattern { |
|
||||||
locations = append(locations,i) |
|
||||||
} |
|
||||||
} |
|
||||||
return locations,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
////////////////////////////////
|
|
||||||
// BA1E
|
|
||||||
|
|
||||||
// Find k-mers (patterns) of length k occurring at least
|
|
||||||
// t times over an interval of length L in a genome.
|
|
||||||
func FindClumps(genome string, k, L, t int) ([]string,error) { |
|
||||||
|
|
||||||
// Algorithm:
|
|
||||||
// allocate a list of kmers
|
|
||||||
// for each possible position of L window,
|
|
||||||
// feed string L to KmerHistogram()
|
|
||||||
// save any kmers with frequency >= t
|
|
||||||
// return master list of saved kmers
|
|
||||||
|
|
||||||
L_slots := len(genome)-L+1 |
|
||||||
|
|
||||||
// Set kmers
|
|
||||||
kmers := map[string]bool{} |
|
||||||
|
|
||||||
// List kmers
|
|
||||||
kmers_list := []string{} |
|
||||||
|
|
||||||
// Loop over each possible window of length L
|
|
||||||
for iL:=0; iL<L_slots; iL++ { |
|
||||||
|
|
||||||
// Grab this portion of the genome
|
|
||||||
winstart := iL |
|
||||||
winend := iL+L |
|
||||||
genome_window := genome[winstart:winend] |
|
||||||
|
|
||||||
// Get the kmers that occur at least
|
|
||||||
// t times in this window
|
|
||||||
new_kmers,err := MoreFrequentThanNKmers(genome_window,k,t) |
|
||||||
if err!=nil { |
|
||||||
return kmers_list,err |
|
||||||
} |
|
||||||
// Add these to the set kmers
|
|
||||||
for _,new_kmer := range new_kmers { |
|
||||||
kmers[new_kmer] = true |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
for kmer := range kmers { |
|
||||||
kmers_list = append(kmers_list,kmer) |
|
||||||
} |
|
||||||
sort.Strings(kmers_list) |
|
||||||
|
|
||||||
return kmers_list,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
////////////////////////////////
|
|
||||||
// BA1F
|
|
||||||
|
|
||||||
// The skew of a genome is the difference between
|
|
||||||
// the number of G and C nucleotides that have occurred
|
|
||||||
// cumulatively in a given strand of DNA.
|
|
||||||
// This function computes the positions in the genome
|
|
||||||
// at which the cumulative skew is minimized.
|
|
||||||
func MinSkewPositions(genome string) ([]int,error) { |
|
||||||
|
|
||||||
n := len(genome) |
|
||||||
cumulative_skew := make([]int,n+1) |
|
||||||
|
|
||||||
// Get C/G bitmasks
|
|
||||||
bitmasks,err := DNA2Bitmasks(genome) |
|
||||||
if err!=nil { |
|
||||||
return nil,err |
|
||||||
} |
|
||||||
c := bitmasks["C"] |
|
||||||
g := bitmasks["G"] |
|
||||||
|
|
||||||
// Init
|
|
||||||
cumulative_skew[0] = 0 |
|
||||||
|
|
||||||
// Make space to keep track of the
|
|
||||||
// minima we have encountered so far
|
|
||||||
min := 999 |
|
||||||
min_skew_ix := []int{} |
|
||||||
|
|
||||||
// At each position, compute the next skew value.
|
|
||||||
// We need two indices b/c for a genome of size N,
|
|
||||||
// the cumulative skew array has N+1 entries.
|
|
||||||
for i,ibit:=1,0; i<=n; i,ibit=i+1,ibit+1 { |
|
||||||
|
|
||||||
var next int |
|
||||||
// Next skew value
|
|
||||||
if c[ibit] { |
|
||||||
// C -1
|
|
||||||
next = -1 |
|
||||||
} else if g[ibit] { |
|
||||||
// G +1
|
|
||||||
next = 1 |
|
||||||
} else { |
|
||||||
next = 0 |
|
||||||
} |
|
||||||
cumulative_skew[i] = cumulative_skew[i-1] + next |
|
||||||
|
|
||||||
if cumulative_skew[i] < min { |
|
||||||
// New min and min_skew
|
|
||||||
min = cumulative_skew[i] |
|
||||||
min_skew_ix = []int{i} |
|
||||||
} else if cumulative_skew[i] == min { |
|
||||||
// Additional min and min_skew
|
|
||||||
min_skew_ix = append(min_skew_ix,i) |
|
||||||
} |
|
||||||
|
|
||||||
} |
|
||||||
return min_skew_ix,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
////////////////////////////////
|
|
||||||
// BA1G
|
|
||||||
|
|
||||||
// Compute the Hamming distance between
|
|
||||||
// two strings. The Hamming distance is
|
|
||||||
// defined as the number of characters
|
|
||||||
// different between two strings.
|
|
||||||
func HammingDistance(p, q string) (int,error) { |
|
||||||
|
|
||||||
// Technically a Hamming distance when
|
|
||||||
// one string is empty would be 0, but
|
|
||||||
// we will throw an error instead.
|
|
||||||
if len(p)==0 || len(q)==0 { |
|
||||||
err := fmt.Sprintf("Error: HammingDistance: one or more arguments had length 0. len(p) = %d, len(q) = %d",len(p),len(q)) |
|
||||||
return -1,errors.New(err) |
|
||||||
} |
|
||||||
|
|
||||||
// Get longest length common to both
|
|
||||||
var m int |
|
||||||
if len(p)>len(q) { |
|
||||||
m = len(q) |
|
||||||
} else { |
|
||||||
m = len(p) |
|
||||||
} |
|
||||||
|
|
||||||
// Accumulate distance
|
|
||||||
dist := 0 |
|
||||||
for i:=0; i<m; i++ { |
|
||||||
if p[i]!=q[i] { |
|
||||||
dist += 1 |
|
||||||
} |
|
||||||
} |
|
||||||
return dist,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
////////////////////////////////
|
|
||||||
// BA1H
|
|
||||||
|
|
||||||
|
|
||||||
// Given a large string (text) and a string (pattern),
|
|
||||||
// find the zero-based indices where we have an occurrence
|
|
||||||
// of pattern or a string with Hamming distance d or less
|
|
||||||
// from pattern.
|
|
||||||
func FindApproximateOccurrences(pattern, text string, d int) ([]int,error) { |
|
||||||
|
|
||||||
locations := []int{} |
|
||||||
slots := len(text)-len(pattern)+1 |
|
||||||
|
|
||||||
if slots<1 { |
|
||||||
// pattern is longer than text
|
|
||||||
return locations,nil |
|
||||||
} |
|
||||||
|
|
||||||
// Loop over each character,
|
|
||||||
// saving the position if it
|
|
||||||
// is the start of an approximate match
|
|
||||||
for i:=0; i<slots; i++ { |
|
||||||
start := i |
|
||||||
end := i+len(pattern) |
|
||||||
poss_approx_pattern := text[start:end] |
|
||||||
hamm,_ := HammingDistance(poss_approx_pattern,pattern) |
|
||||||
if hamm<=d { |
|
||||||
locations = append(locations,i) |
|
||||||
} |
|
||||||
} |
|
||||||
|
|
||||||
return locations,nil |
|
||||||
} |
|
||||||
|
|
||||||
|
|
@ -1,21 +0,0 @@ |
|||||||
https://github.com/moul/euler |
|
||||||
- use snakemake |
|
||||||
|
|
||||||
main.go is a cli: |
|
||||||
- given a problem... |
|
||||||
- print url for problem |
|
||||||
- duration |
|
||||||
- answer |
|
||||||
- awesome go |
|
||||||
|
|
||||||
ba1c test |
|
||||||
- not testing everything |
|
||||||
- finish |
|
||||||
|
|
||||||
code coverage |
|
||||||
- https://mlafeldt.github.io/blog/test-coverage-in-go/ |
|
||||||
- go lint |
|
||||||
- go test |
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -1,95 +0,0 @@ |
|||||||
package main |
|
||||||
|
|
||||||
import ( |
|
||||||
"bufio" |
|
||||||
"fmt" |
|
||||||
"os" |
|
||||||
) |
|
||||||
|
|
||||||
// readLines reads a whole file into memory
|
|
||||||
// and returns a slice of its lines.
|
|
||||||
func readLines(path string) ([]string, error) { |
|
||||||
file, err := os.Open(path) |
|
||||||
if err != nil { |
|
||||||
return nil, err |
|
||||||
} |
|
||||||
defer file.Close() |
|
||||||
|
|
||||||
var lines []string |
|
||||||
scanner := bufio.NewScanner(file) |
|
||||||
buf := make([]byte, 2) |
|
||||||
|
|
||||||
// This is awkward.
|
|
||||||
// Scanners aren't good for big files,
|
|
||||||
// just simple stuff.
|
|
||||||
BIGNUMBER := 90000 |
|
||||||
scanner.Buffer(buf, BIGNUMBER) |
|
||||||
for scanner.Scan() { |
|
||||||
lines = append(lines, scanner.Text()) |
|
||||||
} |
|
||||||
return lines, scanner.Err() |
|
||||||
} |
|
||||||
|
|
||||||
// writeLines writes the lines to the given file.
|
|
||||||
func writeLines(lines []string, path string) error { |
|
||||||
file, err := os.Create(path) |
|
||||||
if err != nil { |
|
||||||
return err |
|
||||||
} |
|
||||||
defer file.Close() |
|
||||||
|
|
||||||
w := bufio.NewWriter(file) |
|
||||||
for _, line := range lines { |
|
||||||
fmt.Fprintln(w, line) |
|
||||||
} |
|
||||||
return w.Flush() |
|
||||||
} |
|
||||||
|
|
||||||
// Utility function: check if two string arrays/array slices
|
|
||||||
// are equal. This is necessary because of squirrely
|
|
||||||
// behavior when comparing arrays (of type [1]string)
|
|
||||||
// and slices (of type []string).
|
|
||||||
func EqualStringSlices(a, b []string) bool { |
|
||||||
if len(a)!=len(b) { |
|
||||||
return false |
|
||||||
} |
|
||||||
for i:=0; i<len(a); i++ { |
|
||||||
if a[i] != b[i] { |
|
||||||
return false |
|
||||||
} |
|
||||||
} |
|
||||||
return true |
|
||||||
} |
|
||||||
|
|
||||||
|
|
||||||
// Utility function: check if two boolean arrays/array slices
|
|
||||||
// are equal. This is necessary because of squirrely
|
|
||||||
// behavior when comparing arrays (of type [1]bool)
|
|
||||||
// and slices (of type []bool).
|
|
||||||
func EqualBoolSlices(a, b []bool) bool { |
|
||||||
if len(a)!=len(b) { |
|
||||||
return false |
|
||||||
} |
|
||||||
for i:=0; i<len(a); i++ { |
|
||||||
if a[i] != b[i] { |
|
||||||
return false |
|
||||||
} |
|
||||||
} |
|
||||||
return true |
|
||||||
} |
|
||||||
|
|
||||||
// Utility function: check if two int arrays/array slices
|
|
||||||
// are equal.
|
|
||||||
func EqualIntSlices(a, b []int) bool { |
|
||||||
if len(a)!=len(b) { |
|
||||||
return false |
|
||||||
} |
|
||||||
for i:=0; i<len(a); i++ { |
|
||||||
if a[i] != b[i] { |
|
||||||
return false |
|
||||||
} |
|
||||||
} |
|
||||||
return true |
|
||||||
} |
|
||||||
|
|
||||||
|
|
@ -0,0 +1,69 @@ |
|||||||
|
# Rosalind Chapter 1 |
||||||
|
|
||||||
|
This folder contains the `chapter1` module, which |
||||||
|
provides functions for each of the problems from |
||||||
|
Chapter 1 of Rosalind.info's Bioinformatics Textbook |
||||||
|
track. |
||||||
|
|
||||||
|
## How to run |
||||||
|
|
||||||
|
* Each problem has its own function (example: `BA1a(...)`) |
||||||
|
|
||||||
|
* Each problem expects an input file |
||||||
|
(example input files in `for_real` directory, |
||||||
|
or provide the input file downloaded |
||||||
|
from Rosalind.info) |
||||||
|
|
||||||
|
* Pass the input file name to the function, like this: |
||||||
|
`BA1a("rosalind_ba1a.txt")` |
||||||
|
|
||||||
|
## Quick Start |
||||||
|
|
||||||
|
To use the functions in this package, start by installing it: |
||||||
|
|
||||||
|
``` |
||||||
|
go get github.com/charlesreid1/go-rosalind/chapter1 |
||||||
|
``` |
||||||
|
|
||||||
|
Once you have installed the `chapter1` package, |
||||||
|
you can import it, then call the function for whichever |
||||||
|
Rosalind.info problem you want to solve from Chapter 1: |
||||||
|
|
||||||
|
``` |
||||||
|
package main |
||||||
|
|
||||||
|
import ( |
||||||
|
rch1 "github.com/charlesreid1/go-rosalind/chapter1" |
||||||
|
) |
||||||
|
|
||||||
|
func main() { |
||||||
|
rch1.BA1a("rosalind_ba1a.txt") |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
## Examples |
||||||
|
|
||||||
|
See `chapter1_test.go` for examples. |
||||||
|
|
||||||
|
## Tests |
||||||
|
|
||||||
|
To run tests of all Chapter 1 problems, run |
||||||
|
`go test` from this directory: |
||||||
|
|
||||||
|
``` |
||||||
|
go test -v |
||||||
|
``` |
||||||
|
|
||||||
|
or, from the parent directory, the root of the |
||||||
|
go-rosalind repository: |
||||||
|
|
||||||
|
``` |
||||||
|
go test -v ./chapter1/... |
||||||
|
``` |
||||||
|
|
||||||
|
Note that this solves every problem in |
||||||
|
Chapter 1 and prints the solutions (so there |
||||||
|
is a lot of spew). It does not check the |
||||||
|
solutions (for that, see the tests in the |
||||||
|
`rosalind` library). |
||||||
|
|
@ -0,0 +1,55 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1a: Count occurrences of a pattern in a string
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1aDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1a:", |
||||||
|
"Count occurrences of a pattern", |
||||||
|
"", |
||||||
|
"Given an input string and a pattern,", |
||||||
|
"report the number of times the pattern", |
||||||
|
"occurs in the input string.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1a/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem,
|
||||||
|
// print the name of the input file,
|
||||||
|
// print the output/result
|
||||||
|
func BA1a(filename string) { |
||||||
|
|
||||||
|
BA1aDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
var input, pattern string |
||||||
|
input = lines[0] |
||||||
|
pattern = lines[1] |
||||||
|
|
||||||
|
result := rosa.PatternCount(input, pattern) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(result) |
||||||
|
} |
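Review note: `BA1a` delegates to `rosa.PatternCount`, whose body is not part of this hunk. The following is a minimal sketch of what such a count could look like, assuming overlapping occurrences are counted. The name `patternCount` is hypothetical and is not the library's implementation.

```go
package main

import "fmt"

// patternCount is a hypothetical stand-in for rosa.PatternCount.
// It counts how many times pattern occurs in text, including
// overlapping occurrences, by sliding a window over text.
func patternCount(text, pattern string) int {
	if len(pattern) == 0 || len(pattern) > len(text) {
		return 0
	}
	count := 0
	for i := 0; i+len(pattern) <= len(text); i++ {
		if text[i:i+len(pattern)] == pattern {
			count++
		}
	}
	return count
}

func main() {
	// "GCG" occurs at offsets 0 and 2 of "GCGCG".
	fmt.Println(patternCount("GCGCG", "GCG"))
}
```

A sliding window is used rather than `strings.Count` because `strings.Count` only counts non-overlapping matches.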
@ -0,0 +1,59 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1b: Most Frequent k-mers
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1bDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1b:", |
||||||
|
"Most Frequent k-mers", |
||||||
|
"", |
||||||
|
"Given an input string and a length k,", |
||||||
|
"report the k-mer or k-mers that occur", |
||||||
|
"most frequently.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1b/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1b(filename string) { |
||||||
|
|
||||||
|
BA1bDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
input := lines[0] |
||||||
|
k_str := lines[1] |
||||||
|
|
||||||
|
k, err := strconv.Atoi(k_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
mfks, _ := rosa.MostFrequentKmers(input, k) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(mfks, " ")) |
||||||
|
} |
@ -0,0 +1,51 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1c: Find the Reverse Complement of a String
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1cDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1c:", |
||||||
|
"Find the Reverse Complement of a String", |
||||||
|
"", |
||||||
|
"Given a DNA input string,", |
||||||
|
"find the reverse complement", |
||||||
|
"of the DNA string.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1c/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1c(filename string) { |
||||||
|
|
||||||
|
BA1cDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
input := lines[0] |
||||||
|
|
||||||
|
result, _ := rosa.ReverseComplement(input) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(result) |
||||||
|
} |
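Review note: `BA1c` discards the error returned by `rosa.ReverseComplement`. The sketch below shows one way a reverse complement with error reporting could be written. `reverseComplement` is a hypothetical name, and the uppercase-only alphabet is an assumption.

```go
package main

import "fmt"

// reverseComplement is a hypothetical sketch, not the rosalind
// library's implementation. It complements each base (A<->T, C<->G)
// and writes it to the mirrored position in the output.
func reverseComplement(dna string) (string, error) {
	out := make([]byte, len(dna))
	for i := 0; i < len(dna); i++ {
		var c byte
		switch dna[i] {
		case 'A':
			c = 'T'
		case 'T':
			c = 'A'
		case 'C':
			c = 'G'
		case 'G':
			c = 'C'
		default:
			return "", fmt.Errorf("invalid base %q at index %d", dna[i], i)
		}
		out[len(dna)-1-i] = c
	}
	return string(out), nil
}

func main() {
	rc, err := reverseComplement("AAAACCCGGT")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rc)
}
```

A caller such as `BA1c` could check the returned error with `log.Fatalf` instead of assigning it to `_`.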
@ -0,0 +1,61 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1d: Find all occurrences of pattern in string
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1dDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1d:", |
||||||
|
"Find all occurrences of pattern in string", |
||||||
|
"", |
||||||
|
"Given a string input (genome) and a substring (pattern),", |
||||||
|
"return all starting positions in the genome where the", |
||||||
|
"pattern occurs in the genome.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1d/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1d(filename string) { |
||||||
|
|
||||||
|
BA1dDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
pattern := lines[0] |
||||||
|
genome := lines[1] |
||||||
|
|
||||||
|
// Result is a slice of ints
|
||||||
|
locs, _ := rosa.FindOccurrences(pattern, genome) |
||||||
|
|
||||||
|
// Convert to a slice of strings for easier printing
|
||||||
|
locs_str := make([]string, len(locs)) |
||||||
|
for i, j := range locs { |
||||||
|
locs_str[i] = strconv.Itoa(j) |
||||||
|
} |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(locs_str, " ")) |
||||||
|
} |
@ -0,0 +1,59 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1e: Find patterns forming clumps in a string
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1eDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1e:", |
||||||
|
"Find patterns forming clumps in a string", |
||||||
|
"", |
||||||
|
"A clump is characterized by integers L and t", |
||||||
|
"if there is an interval in the genome of length L", |
||||||
|
"in which a given pattern occurs t or more times.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1e/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1e(filename string) { |
||||||
|
|
||||||
|
BA1eDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
genome := lines[0] |
||||||
|
params_str := lines[1] |
||||||
|
params_slice := strings.Split(params_str, " ") |
||||||
|
|
||||||
|
k, _ := strconv.Atoi(params_slice[0]) |
||||||
|
L, _ := strconv.Atoi(params_slice[1]) |
||||||
|
t, _ := strconv.Atoi(params_slice[2]) |
||||||
|
|
||||||
|
patterns, _ := rosa.FindClumps(genome, k, L, t) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(patterns, " ")) |
||||||
|
} |
@ -0,0 +1,61 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1f: Find positions in a gene that minimize skew
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1fDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1f:", |
||||||
|
"Find positions in a gene that minimize skew", |
||||||
|
"", |
||||||
|
"The skew of a genome is defined as the difference", |
||||||
|
"between the number of G nucleotides and the number of C", |
||||||
|
"nucleotides. Given a DNA string, this function should", |
||||||
|
"compute the cumulative skew for each position in", |
||||||
|
"the genome, and report the indices where the skew", |
||||||
|
"value is minimized.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1f/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1f(filename string) { |
||||||
|
|
||||||
|
BA1fDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
genome := lines[0] |
||||||
|
|
||||||
|
minskew, _ := rosa.MinSkewPositions(genome) |
||||||
|
|
||||||
|
minskew_str := make([]string, len(minskew)) |
||||||
|
for i, j := range minskew { |
||||||
|
minskew_str[i] = strconv.Itoa(j) |
||||||
|
} |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(minskew_str, " ")) |
||||||
|
} |
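Review note: the description above talks about a cumulative skew. A minimal sketch of that computation follows, assuming skew is the count of G minus the count of C over each prefix, and that reported positions are prefix lengths from 0 to the genome length. `minSkewPositions` is a hypothetical name; the indexing convention of `rosa.MinSkewPositions` may differ.

```go
package main

import "fmt"

// minSkewPositions is a hypothetical sketch of the min-skew search.
// skew tracks (#G - #C) over the prefix genome[:i+1]; position 0
// corresponds to the empty prefix, whose skew is 0.
func minSkewPositions(genome string) []int {
	skew, minSkew := 0, 0
	positions := []int{0}
	for i := 0; i < len(genome); i++ {
		switch genome[i] {
		case 'G':
			skew++
		case 'C':
			skew--
		}
		switch {
		case skew < minSkew:
			// New minimum: discard earlier positions.
			minSkew = skew
			positions = []int{i + 1}
		case skew == minSkew:
			positions = append(positions, i+1)
		}
	}
	return positions
}

func main() {
	fmt.Println(minSkewPositions("CCGG"))
}
```

The single pass keeps only the running skew and the current list of minimizers, so no per-position skew array is stored.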
@ -0,0 +1,53 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1g: Find Hamming distance between two DNA strings
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1gDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1g:", |
||||||
|
"Find Hamming distance between two DNA strings", |
||||||
|
"", |
||||||
|
"The Hamming distance between two strings HammingDistance(p,q)", |
||||||
|
"is the number of characters different between the two", |
||||||
|
"strands. This program computes the Hamming distance", |
||||||
|
"between two strings.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1g/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1g(filename string) { |
||||||
|
|
||||||
|
BA1gDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
p := lines[0] |
||||||
|
q := lines[1] |
||||||
|
|
||||||
|
hamm, _ := rosa.HammingDistance(p, q) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(hamm) |
||||||
|
} |
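Review note: `BA1g` ignores the error from `rosa.HammingDistance`. The sketch below illustrates why an error return is useful here: the distance is only defined for strings of equal length. `hammingDistance` is a hypothetical name, not the library function.

```go
package main

import (
	"errors"
	"fmt"
)

// hammingDistance is a hypothetical sketch. It counts the positions
// at which p and q differ, and returns an error if the lengths differ.
func hammingDistance(p, q string) (int, error) {
	if len(p) != len(q) {
		return 0, errors.New("hamming distance requires strings of equal length")
	}
	dist := 0
	for i := 0; i < len(p); i++ {
		if p[i] != q[i] {
			dist++
		}
	}
	return dist, nil
}

func main() {
	d, err := hammingDistance("GGGCCGTTGGT", "GGACCGTTGAC")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d)
}
```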
@ -0,0 +1,66 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1h: Find approximate occurrences of pattern in string
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1hDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1h:", |
||||||
|
"Find approximate occurrences of pattern in string", |
||||||
|
"", |
||||||
|
"Given a string Text and a string Pattern, and a maximum", |
||||||
|
"Hamming distance d, return all locations in Text where", |
||||||
|
"there is an approximate match with Pattern (i.e., a pattern", |
||||||
|
"with a Hamming distance from Pattern of d or less).", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1h/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1h(filename string) { |
||||||
|
|
||||||
|
BA1hDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
pattern := lines[0] |
||||||
|
text := lines[1] |
||||||
|
d_str := lines[2] |
||||||
|
|
||||||
|
d, err := strconv.Atoi(d_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion for d: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
approx, _ := rosa.FindApproximateOccurrences(pattern, text, d) |
||||||
|
|
||||||
|
approx_str := make([]string, len(approx)) |
||||||
|
for i, j := range approx { |
||||||
|
approx_str[i] = strconv.Itoa(j) |
||||||
|
} |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(approx_str, " ")) |
||||||
|
} |
@ -0,0 +1,70 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1i: Most Frequent Words with Mismatches
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1iDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1i:", |
||||||
|
"Most Frequent Words with Mismatches", |
||||||
|
"", |
||||||
|
"Given an input string and a maximum allowable", |
||||||
|
"Hamming distance d, report the most frequent", |
||||||
|
"kmer that either occurs or whose Hamming neighbors", |
||||||
|
"occur most frequently.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1i/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1i(filename string) { |
||||||
|
|
||||||
|
BA1iDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
input := lines[0] |
||||||
|
params := strings.Split(lines[1], " ") |
||||||
|
if len(params) < 2 { |
||||||
|
log.Fatalf("Error splitting second line: only found 0-1 tokens") |
||||||
|
} |
||||||
|
|
||||||
|
k_str, d_str := params[0], params[1] |
||||||
|
|
||||||
|
k, err := strconv.Atoi(k_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion for parameter k: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
d, err := strconv.Atoi(d_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion for parameter d: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
mfks_mis, _ := rosa.MostFrequentKmersMismatches(input, k, d) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(mfks_mis, " ")) |
||||||
|
} |
@ -0,0 +1,71 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1j: Most Frequent Words with Mismatches and Reverse Complements
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1jDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1j:", |
||||||
|
"Most Frequent Words with Mismatches and Reverse Complements", |
||||||
|
"", |
||||||
|
"Given an input string and a maximum allowable", |
||||||
|
"Hamming distance d, report the most frequent", |
||||||
|
"kmer that either occurs or whose Hamming neighbors", |
||||||
|
"occur most frequently in the input string and in the", |
||||||
|
"reverse complement of the input string.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1j/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1j(filename string) { |
||||||
|
|
||||||
|
BA1jDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
input := lines[0] |
||||||
|
params := strings.Split(lines[1], " ") |
||||||
|
if len(params) < 2 { |
||||||
|
log.Fatalf("Error splitting second line: only found 0-1 tokens") |
||||||
|
} |
||||||
|
|
||||||
|
k_str, d_str := params[0], params[1] |
||||||
|
|
||||||
|
k, err := strconv.Atoi(k_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion for parameter k: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
d, err := strconv.Atoi(d_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion for parameter d: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
mfks_mis, _ := rosa.MostFrequentKmersMismatchesRevComp(input, k, d) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(mfks_mis, " ")) |
||||||
|
} |
@ -0,0 +1,62 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1k: Generate Frequency Array
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1kDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1k:", |
||||||
|
"Generate Frequency Array", |
||||||
|
"", |
||||||
|
"Given an integer k, generate the frequency array of", |
||||||
|
"an input string. The frequency array is an array of", |
||||||
|
"counts with one count per index, and integers mapped", |
||||||
|
"to kmers.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1k/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1k(filename string) { |
||||||
|
|
||||||
|
BA1kDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
input := lines[0] |
||||||
|
k_str := lines[1] |
||||||
|
|
||||||
|
k, err := strconv.Atoi(k_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion for parameter k: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
arr, _ := rosa.FrequencyArray(input, k) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
for _, e := range arr { |
||||||
|
fmt.Print(e, " ") |
||||||
|
} |
||||||
|
//fmt.Println(strings.Join(arr, " "))
|
||||||
|
} |
@ -0,0 +1,51 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1L: Pattern to Number
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1LDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1L:", |
||||||
|
"Pattern to Number", |
||||||
|
"", |
||||||
|
"Given an input kmer of length k, convert it to", |
||||||
|
"an integer corresponding to its lexicographic", |
||||||
|
"order among kmers of length k.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1l/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1L(filename string) { |
||||||
|
|
||||||
|
BA1LDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
input := lines[0] |
||||||
|
|
||||||
|
number, _ := rosa.PatternToNumber(input) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(number) |
||||||
|
} |
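Review note: the description above maps a k-mer to its lexicographic rank. One common way to do that is to read the k-mer as a base-4 number, which the sketch below does. It assumes the digit order A=0, C=1, G=2, T=3; `patternToNumber` is a hypothetical name, not the body of `rosa.PatternToNumber`.

```go
package main

import "fmt"

// patternToNumber is a hypothetical sketch. It interprets pattern as
// a base-4 number with digits A=0, C=1, G=2, T=3, most significant
// digit first, which matches lexicographic order among k-mers.
func patternToNumber(pattern string) (int, error) {
	n := 0
	for i := 0; i < len(pattern); i++ {
		var digit int
		switch pattern[i] {
		case 'A':
			digit = 0
		case 'C':
			digit = 1
		case 'G':
			digit = 2
		case 'T':
			digit = 3
		default:
			return 0, fmt.Errorf("invalid nucleotide %q at index %d", pattern[i], i)
		}
		n = n*4 + digit
	}
	return n, nil
}

func main() {
	n, err := patternToNumber("AGT")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n)
}
```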
@ -0,0 +1,62 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1m: Number to Pattern
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1mDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1m:", |
||||||
|
"Number to Pattern", |
||||||
|
"", |
||||||
|
"Given an integer and a kmer length k, convert", |
||||||
|
"the integer to its corresponding kmer.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1m/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1m(filename string) { |
||||||
|
|
||||||
|
BA1mDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
number_str := lines[0] |
||||||
|
k_str := lines[1] |
||||||
|
|
||||||
|
number, err := strconv.Atoi(number_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion for number: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
k, err := strconv.Atoi(k_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion for k: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
result, _ := rosa.NumberToPattern(number, k) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(result) |
||||||
|
} |
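Review note: the inverse conversion can be sketched as repeated division by 4, filling the k-mer from its last symbol backwards. This assumes the same digit order A=0, C=1, G=2, T=3; `numberToPattern` is a hypothetical name, not the body of `rosa.NumberToPattern`.

```go
package main

import (
	"errors"
	"fmt"
)

// numberToPattern is a hypothetical sketch. It writes number as a
// k-digit base-4 value using the alphabet "ACGT", least significant
// digit last, and rejects values that do not fit in k symbols.
func numberToPattern(number, k int) (string, error) {
	if number < 0 || k < 0 {
		return "", errors.New("number and k must be non-negative")
	}
	const alphabet = "ACGT"
	buf := make([]byte, k)
	for i := k - 1; i >= 0; i-- {
		buf[i] = alphabet[number%4]
		number /= 4
	}
	if number != 0 {
		return "", errors.New("number does not fit in k symbols")
	}
	return string(buf), nil
}

func main() {
	p, err := numberToPattern(45, 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p)
}
```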
@ -0,0 +1,60 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Rosalind: Problem BA1n: Calculating d-Neighborhood of String
|
||||||
|
|
||||||
|
// Describe the problem
|
||||||
|
func BA1nDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA1n:", |
||||||
|
"Calculating d-Neighborhood of String", |
||||||
|
"", |
||||||
|
"Given an input string of DNA and a Hamming", |
||||||
|
"distance d, compute all DNA strings that", |
||||||
|
"are a Hamming distance of up to d away.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba1n/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Describe the problem, and call the function
|
||||||
|
func BA1n(filename string) { |
||||||
|
|
||||||
|
BA1nDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a single string
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
input := lines[0] |
||||||
|
d_str := lines[1] |
||||||
|
|
||||||
|
d, err := strconv.Atoi(d_str) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("Error: string to int conversion for d: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
result, _ := rosa.VisitHammingNeighbors(input, d) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
for _, j := range result { |
||||||
|
fmt.Println(j) |
||||||
|
} |
||||||
|
} |
@ -0,0 +1,20 @@ |
|||||||
|
package rosalindchapter1 |
||||||
|
|
||||||
|
import "testing" |
||||||
|
|
||||||
|
func TestChapter01(t *testing.T) { |
||||||
|
BA1a("for_real/rosalind_ba1a.txt") |
||||||
|
BA1b("for_real/rosalind_ba1b.txt") |
||||||
|
BA1c("for_real/rosalind_ba1c.txt") |
||||||
|
BA1d("for_real/rosalind_ba1d.txt") |
||||||
|
BA1e("for_real/rosalind_ba1e.txt") |
||||||
|
BA1f("for_real/rosalind_ba1f.txt") |
||||||
|
BA1g("for_real/rosalind_ba1g.txt") |
||||||
|
BA1h("for_real/rosalind_ba1h.txt") |
||||||
|
BA1i("for_real/rosalind_ba1i.txt") |
||||||
|
BA1j("for_real/rosalind_ba1j.txt") |
||||||
|
BA1k("for_real/rosalind_ba1k.txt") |
||||||
|
BA1L("for_real/rosalind_ba1l.txt") |
||||||
|
BA1m("for_real/rosalind_ba1m.txt") |
||||||
|
BA1n("for_real/rosalind_ba1n.txt") |
||||||
|
} |
@ -0,0 +1,2 @@ |
|||||||
|
CAGTGTAAGTAACGGATTGAGGACGTAACGGACTAGTATTCGAGGACAGTGTAATTGAGGACGTAACGGAGTAACGGATCGAGGACTAGTATCAGTGTAATTGAGGACGTAACGGAGTAACGGACAGTGTAACAGTGTAACTAGTATGTAACGGACAGTGTAAGTAACGGAGTAACGGAGTAACGGATCGAGGATTGAGGACCTAGTATCTAGTATTCGAGGATCGAGGATTGAGGACCTAGTATCTAGTATGTAACGGATTGAGGACTTGAGGACCTAGTATTCGAGGATCGAGGAGTAACGGACAGTGTAACAGTGTAATCGAGGATCGAGGACAGTGTAATTGAGGACTCGAGGACTAGTATTTGAGGACTCGAGGATTGAGGACGTAACGGAGTAACGGATCGAGGACTAGTATGTAACGGAGTAACGGACAGTGTAACTAGTATTTGAGGACCAGTGTAACAGTGTAACAGTGTAACAGTGTAACAGTGTAACTAGTATGTAACGGAGTAACGGATTGAGGACGTAACGGAGTAACGGATCGAGGATTGAGGACCTAGTATTTGAGGACGTAACGGATTGAGGACCTAGTATCTAGTATCAGTGTAACTAGTATGTAACGGATCGAGGATCGAGGACAGTGTAATTGAGGACTTGAGGACCAGTGTAATCGAGGATTGAGGACTTGAGGACTTGAGGACTCGAGGACAGTGTAAGTAACGGAGTAACGGATCGAGGACAGTGTAATTGAGGACCTAGTATTTGAGGACCTAGTATGTAACGGATTGAGGACCAGTGTAACTAGTATCTAGTATCTAGTATCAGTGTAATTGAGGACTCGAGGATTGAGGAC |
||||||
|
6 2 |
@ -0,0 +1,2 @@ |
|||||||
|
TTACTCGCTGGCAGGTTGACGGAGAAATATTGGTGACGGAGAAGACGGAGAATGGGCATATATTGGTTGGCAGGTTTGGGCATTTACTCGCGACGGAGAATTACTCGCTGGGCATTTACTCGCTGGGCATTTACTCGCTGGCAGGTTTGGCAGGTTATATTGGTATATTGGTATATTGGTTGGGCATTTACTCGCGACGGAGAATGGCAGGTTGACGGAGAAGACGGAGAAATATTGGTTTACTCGCATATTGGTGACGGAGAAATATTGGTTTACTCGCTTACTCGCTGGGCATTGGGCATTGGCAGGTTGACGGAGAAGACGGAGAATTACTCGCATATTGGTTTACTCGCGACGGAGAATTACTCGCATATTGGTGACGGAGAAGACGGAGAATTACTCGCTGGCAGGTTTGGGCATTGGGCATTTACTCGCTGGCAGGTTTGGGCATTGGCAGGTTGACGGAGAAGACGGAGAATGGCAGGTTTGGCAGGTTTGGCAGGTTTGGCAGGTTTGGGCATGACGGAGAATTACTCGCTGGCAGGTTTTACTCGCTGGCAGGTTTTACTCGCATATTGGTTGGCAGGTTTTACTCGCTTACTCGCTTACTCGCGACGGAGAAGACGGAGAAATATTGGTATATTGGTATATTGGTTGGCAGGTTTGGCAGGTTTGGCAGGTTATATTGGTTTACTCGCTTACTCGCATATTGGTTGGCAGGTTTGGGCATTGGCAGGTTTGGCAGGTTGACGGAGAATGGCAGGTTGACGGAGAAGACGGAGAATGGGCATTGGGCATGACGGAGAATGGCAGGTT |
||||||
|
5 3 |
@ -0,0 +1,2 @@ |
|||||||
|
CAATGAGTGATATTGTTTGGTAGCAATCCATAGTTGAGGCCCTACGGAAGTTGCATCCGGGGCCCGTAGGACTCGCGGGCAAAAGATTGCTAAGCATTCTTGGTCACCATCGCAGTATTGCTCGTAGTCGGGTGGGTTTGCCGAACTGATAATGTGCCAGTCCCCGCGGAACCGGAATCAGGGCAACGGCTAGAGATACTCTCCGTGGGTCCTAAGTAGGAGGCTTGGGGCTGAGTGAGCAACCACTTACTCGAGTGTGTTGTTTTCTGTGCGTCCCCCGGGCGGTGTTCATTTAAGGATGACCGGGTGAGTAACCGAACAATTTTGTTGCCATGAAACGCGGCAATAACTCAATCTACCAGTACGGACAAATATAATGTTGGGCCCTTTTAGCTTAACGGACGTCGTCCCATTCTGACCTTAACTAAGACTATAAGGTAGGGGGTCAGATACGACACGGTCAGTAGGTGGATATACCGTGACAAATACCGGCACCTATGCTAATTGCGATTTGGAATGGAACGCGCCGAATACTTCGGATCATATCACCGTCCCTGTACTCGAAAGTTCTGCCACGAACAAGTCTCCTACTTGTGTCTTTTCTCACTGCGAAG |
||||||
|
5 |
@ -0,0 +1,69 @@ |
|||||||
|
# Rosalind Chapter 2

This folder contains the `chapter2` package, which
provides functions for each of the problems from
Chapter 2 of Rosalind.info's Bioinformatics Textbook
track.

## How to run

* Each problem has its own function (example: `BA2a(...)`)

* Each problem expects an input file
  (example input files are in the `for_real` directory,
  or provide the input file downloaded
  from Rosalind.info)

* Pass the input file name to the function, like this:
  `BA2a("rosalind_ba2a.txt")`
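
As an illustration of what such an input file contains, here is a minimal sketch that assumes a BA2a-style layout: two integers on the first line, then one DNA string per line. `parseMotifInput` is a hypothetical helper written for this example; it is not part of the `chapter2` package.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMotifInput is a hypothetical helper (not part of go-rosalind).
// It splits the lines of a BA2a-style input into k, d, and the DNA strings.
func parseMotifInput(lines []string) (k, d int, dna []string, err error) {
	if len(lines) < 2 {
		return 0, 0, nil, fmt.Errorf("expected a parameter line and at least one DNA string, got %d lines", len(lines))
	}
	params := strings.Fields(lines[0])
	if len(params) != 2 {
		return 0, 0, nil, fmt.Errorf("expected two parameters, got %d", len(params))
	}
	if k, err = strconv.Atoi(params[0]); err != nil {
		return 0, 0, nil, fmt.Errorf("parsing k: %v", err)
	}
	if d, err = strconv.Atoi(params[1]); err != nil {
		return 0, 0, nil, fmt.Errorf("parsing d: %v", err)
	}
	return k, d, lines[1:], nil
}

func main() {
	// Illustrative input: a parameter line followed by two DNA strings.
	lines := []string{"5 1", "GATTTGGGCCAAAGTCTGCGGCGAA", "GATGTGCGTCAACCAGTCGGAGTCC"}
	k, d, dna, err := parseMotifInput(lines)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(k, d, len(dna))
}
```

Returning an error, rather than discarding it, makes a malformed parameter line visible before any computation starts.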

## Quick Start

To use the functions in this package, start by installing it:

```
go get github.com/charlesreid1/go-rosalind/chapter2
```

Once you have installed the `chapter2` package,
you can import it, then call the function for whichever
Rosalind.info problem you want to solve from Chapter 2:

```
package main

import (
	rch2 "github.com/charlesreid1/go-rosalind/chapter2"
)

func main() {
	rch2.BA2a("rosalind_ba2a.txt")
}
```

## Examples

See `chapter2_test.go` for examples.

## Tests

To run tests of all Chapter 2 problems, run
`go test` from this directory:

```
go test -v
```

or, from the parent directory, the root of the
go-rosalind repository:

```
go test -v ./chapter2/...
```

Note that this solves every problem in
Chapter 2 and prints the solutions, so the
output is verbose. It does not check the
solutions; for that, see the tests in the
`rosalind` library.
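
Since `go test` in this directory only prints solutions, any check has to be written separately. The sketch below shows one way a BA2a answer could be verified, assuming the usual definition of a (k, d)-motif: a k-mer that appears in every DNA string with at most d mismatches. The names `hamming`, `appearsWithMismatches`, and `isMotif` are hypothetical and do not come from the `rosalind` library; the DNA strings are illustrative.

```go
package main

import "fmt"

// hamming counts the positions at which two equal-length strings differ.
func hamming(a, b string) int {
	n := 0
	for i := 0; i < len(a); i++ {
		if a[i] != b[i] {
			n++
		}
	}
	return n
}

// appearsWithMismatches reports whether some window of text with the same
// length as pattern is within Hamming distance d of pattern.
func appearsWithMismatches(text, pattern string, d int) bool {
	k := len(pattern)
	for i := 0; i+k <= len(text); i++ {
		if hamming(text[i:i+k], pattern) <= d {
			return true
		}
	}
	return false
}

// isMotif reports whether pattern appears, with at most d mismatches,
// in every string of dna.
func isMotif(dna []string, pattern string, d int) bool {
	for _, text := range dna {
		if !appearsWithMismatches(text, pattern, d) {
			return false
		}
	}
	return true
}

func main() {
	dna := []string{"ATTTGGC", "TGCCTTA", "CGGTATC", "GAAAATT"}
	for _, candidate := range []string{"ATA", "CCC"} {
		fmt.Printf("%s is a (3,1)-motif: %v\n", candidate, isMotif(dna, candidate, 1))
	}
}
```

A check like this validates each reported k-mer independently, so it does not need a stored reference answer.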
||||||
|
|
@ -0,0 +1,67 @@ |
|||||||
|
package rosalindchapter2 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA2a: Implement Motif Enumeration
|
||||||
|
func BA2aDescription() { |
||||||
|
	description := []string{
		"-----------------------------------------",
		"Rosalind: Problem BA2a:",
		"Implement Motif Enumeration",
		"",
		"Given a collection of strings of DNA, find all motifs (kmers of length k that appear in every DNA string with at most d mismatches).",
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba2a/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA2a(filename string) { |
||||||
|
|
||||||
|
BA2aDescription() |
||||||
|
|
||||||
|
	// Read the contents of the input file
	// into a slice of lines
	lines, err := rosa.ReadLines(filename)
	if err != nil {
		log.Fatalf("ReadLines: %v", err)
	}

	// Input file contents:
	// the first line holds the parameters k and d,
	// and the rest of the lines are DNA strings.
	params := strings.Split(lines[0], " ")
	if len(params) < 2 {
		log.Fatalf("Error: expected k and d on the first line, got: %q", lines[0])
	}
	k, err := strconv.Atoi(params[0])
	if err != nil {
		log.Fatalf("Error: string to int conversion for k: %v", err)
	}
	d, err := strconv.Atoi(params[1])
	if err != nil {
		log.Fatalf("Error: string to int conversion for d: %v", err)
	}

	// Copy the DNA strings into their own slice
	dna := make([]string, len(lines)-1)
	copy(dna, lines[1:])
||||||
|
|
||||||
|
results, _ := rosa.FindMotifs(dna, k, d) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(results, " ")) |
||||||
|
} |
@ -0,0 +1,61 @@ |
|||||||
|
package rosalindchapter2 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA2b: Find a Median String
|
||||||
|
func BA2bDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA2b:", |
||||||
|
"Find a Median String", |
||||||
|
"", |
||||||
|
"Given a kmer length k and a set of strings of DNA, find the kmer(s) that minimize the L1 norm of the distance from it to all other DNA strings.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba2b/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA2b(filename string) { |
||||||
|
|
||||||
|
BA2bDescription() |
||||||
|
|
||||||
|
	// Read the contents of the input file
	// into a slice of lines
	lines, err := rosa.ReadLines(filename)
	if err != nil {
		log.Fatalf("rosa.ReadLines: %v", err)
	}

	// Input file contents:
	// the first line holds the kmer length k,
	// and the rest of the lines are DNA strings.
	k, err := strconv.Atoi(lines[0])
	if err != nil {
		log.Fatalf("Error: string to int conversion for k: %v", err)
	}

	// Copy the DNA strings into their own slice
	dna := make([]string, len(lines)-1)
	copy(dna, lines[1:])
||||||
|
|
||||||
|
results, _ := rosa.MedianString(dna, k) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(results) |
||||||
|
} |
@ -0,0 +1,54 @@ |
|||||||
|
package rosalindchapter2 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA2c: Find a Profile-most Probable k-mer in a String
|
||||||
|
func BA2cDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA2c:", |
||||||
|
"Find a Profile-most Probable k-mer in a String", |
||||||
|
"", |
||||||
|
"Given a profile matrix, find the most probable k-mer to generate the given DNA string.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba2c/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA2c(filename string) { |
||||||
|
|
||||||
|
BA2cDescription() |
||||||
|
|
||||||
|
	// Read the contents of the input file
	// into a slice of lines
	lines, err := rosa.ReadLines(filename)
	if err != nil {
		log.Fatalf("rosa.ReadLines: %v", err)
	}

	// Input file contents:
	// the DNA string, the kmer length k,
	// then the four rows of the profile matrix.
	dna := lines[0]
	k, err := strconv.Atoi(lines[1])
	if err != nil {
		log.Fatalf("Error: string to int conversion for k: %v", err)
	}

	// Parse the four profile rows (k values each) into a matrix
	profile, _ := rosa.ReadMatrix32(lines[2:6], k)
||||||
|
|
||||||
|
// Find the most probable kmer
|
||||||
|
result, _ := rosa.ProfileMostProbableKmers(dna, k, profile) |
||||||
|
fmt.Println(strings.Join(result, " ")) |
||||||
|
} |
@ -0,0 +1,67 @@ |
|||||||
|
package rosalindchapter2 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA2d: Implement GreedyMotifSearch
|
||||||
|
func BA2dDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA2d:", |
||||||
|
"Implement GreedyMotifSearch", |
||||||
|
"", |
||||||
|
"Find a collection of motif strings using a greedy motif search. Return first-occurring profile-most probable kmer.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba2d/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA2d(filename string) { |
||||||
|
|
||||||
|
BA2dDescription() |
||||||
|
|
||||||
|
	// Read the contents of the input file
	// into a slice of lines
	lines, err := rosa.ReadLines(filename)
	if err != nil {
		log.Fatalf("rosa.ReadLines: %v", err)
	}

	// Input file contents:
	// the first line holds the parameters k and t,
	// and the rest of the lines are DNA strings.
	params := strings.Split(lines[0], " ")
	if len(params) < 2 {
		log.Fatalf("Error: expected k and t on the first line, got: %q", lines[0])
	}
	k, err := strconv.Atoi(params[0])
	if err != nil {
		log.Fatalf("Error: string to int conversion for k: %v", err)
	}
	t, err := strconv.Atoi(params[1])
	if err != nil {
		log.Fatalf("Error: string to int conversion for t: %v", err)
	}

	// Copy the DNA strings into their own slice
	dna := make([]string, len(lines)-1)
	copy(dna, lines[1:])
||||||
|
|
||||||
|
result, _ := rosa.GreedyMotifSearchNoPseudocounts(dna, k, t) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(result, " ")) |
||||||
|
} |
@ -0,0 +1,67 @@ |
|||||||
|
package rosalindchapter2 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA2e: Implement GreedyMotifSearch with Pseudocounts
|
||||||
|
func BA2eDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA2e:", |
||||||
|
"Implement GreedyMotifSearch with Pseudocounts", |
||||||
|
"", |
||||||
|
"Re-implement problem BA2d (greedy motif search) using pseudocounts, which avoid setting probabilities to an absolute value of zero.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba2e/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA2e(filename string) { |
||||||
|
|
||||||
|
BA2eDescription() |
||||||
|
|
||||||
|
	// Read the contents of the input file
	// into a slice of lines
	lines, err := rosa.ReadLines(filename)
	if err != nil {
		log.Fatalf("rosa.ReadLines: %v", err)
	}

	// Input file contents:
	// the first line holds the parameters k and t,
	// and the rest of the lines are DNA strings.
	params := strings.Split(lines[0], " ")
	if len(params) < 2 {
		log.Fatalf("Error: expected k and t on the first line, got: %q", lines[0])
	}
	k, err := strconv.Atoi(params[0])
	if err != nil {
		log.Fatalf("Error: string to int conversion for k: %v", err)
	}
	t, err := strconv.Atoi(params[1])
	if err != nil {
		log.Fatalf("Error: string to int conversion for t: %v", err)
	}

	// Copy the DNA strings into their own slice
	dna := make([]string, len(lines)-1)
	copy(dna, lines[1:])
||||||
|
|
||||||
|
result, _ := rosa.GreedyMotifSearchPseudocounts(dna, k, t) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(result, " ")) |
||||||
|
} |
@ -0,0 +1,64 @@ |
|||||||
|
package rosalindchapter2 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA2f: Implement RandomizedMotifSearch with Pseudocounts
|
||||||
|
func BA2fDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA2f:", |
||||||
|
"Implement RandomizedMotifSearch with Pseudocounts", |
||||||
|
"", |
||||||
|
"Re-implement problem BA2e (greedy motif search with pseudocounts) but use a random, instead of greedy, algorithm to pick motif kmers from each DNA string.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba2f/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA2f(filename string) { |
||||||
|
|
||||||
|
BA2fDescription() |
||||||
|
|
||||||
|
	// Read the contents of the input file
	// into a slice of lines
	lines, err := rosa.ReadLines(filename)
	if err != nil {
		log.Fatalf("rosa.ReadLines: %v", err)
	}

	// Input file contents:
	// the first line holds the parameters k and t,
	// and the rest of the lines are DNA strings.
	params := strings.Split(lines[0], " ")
	if len(params) < 2 {
		log.Fatalf("Error: expected k and t on the first line, got: %q", lines[0])
	}
	k, err := strconv.Atoi(params[0])
	if err != nil {
		log.Fatalf("Error: string to int conversion for k: %v", err)
	}
	t, err := strconv.Atoi(params[1])
	if err != nil {
		log.Fatalf("Error: string to int conversion for t: %v", err)
	}

	// Copy the DNA strings into their own slice
	dna := make([]string, len(lines)-1)
	copy(dna, lines[1:])
||||||
|
|
||||||
|
n := 100 |
||||||
|
result, _ := rosa.ManyRandomMotifSearches(dna, k, t, n) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(result, "\n")) |
||||||
|
} |
@ -0,0 +1,65 @@ |
|||||||
|
package rosalindchapter2 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA2g: Implement GibbsSampler
|
||||||
|
func BA2gDescription() { |
||||||
|
	description := []string{
		"-----------------------------------------",
		"Rosalind: Problem BA2g:",
		"Implement GibbsSampler",
		"",
		"Compute the probability of each kmer in a DNA string using the profile, and assemble these into a list of probabilities. GibbsSampler uses this list as a weighted random number generator to pick a random k-mer.",
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba2g/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA2g(filename string) { |
||||||
|
|
||||||
|
BA2gDescription() |
||||||
|
|
||||||
|
	// Read the contents of the input file
	// into a slice of lines
	lines, err := rosa.ReadLines(filename)
	if err != nil {
		log.Fatalf("rosa.ReadLines: %v", err)
	}

	// Input file contents:
	// the first line holds the parameters k and t,
	// and the rest of the lines are DNA strings.
	// Any further value on the first line is not read here;
	// the iteration counts are set below.
	params := strings.Split(lines[0], " ")
	if len(params) < 2 {
		log.Fatalf("Error: expected k and t on the first line, got: %q", lines[0])
	}
	k, err := strconv.Atoi(params[0])
	if err != nil {
		log.Fatalf("Error: string to int conversion for k: %v", err)
	}
	t, err := strconv.Atoi(params[1])
	if err != nil {
		log.Fatalf("Error: string to int conversion for t: %v", err)
	}

	// Copy the DNA strings into their own slice
	dna := make([]string, len(lines)-1)
	copy(dna, lines[1:])
||||||
|
|
||||||
|
n := 100 |
||||||
|
n_starts := 20 |
||||||
|
result, _ := rosa.ManyGibbsSamplers(dna, k, t, n, n_starts) |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(strings.Join(result, "\n")) |
||||||
|
} |
@ -0,0 +1,13 @@ |
|||||||
|
package rosalindchapter2 |
||||||
|
|
||||||
|
import "testing" |
||||||
|
|
||||||
|
func TestChapter02(t *testing.T) { |
||||||
|
//BA2a("for_real/rosalind_ba2a.txt")
|
||||||
|
//BA2b("for_real/rosalind_ba2b.txt")
|
||||||
|
//BA2c("for_real/rosalind_ba2c.txt")
|
||||||
|
//BA2d("for_real/rosalind_ba2d.txt")
|
||||||
|
//BA2e("for_real/rosalind_ba2e.txt")
|
||||||
|
//BA2f("for_real/rosalind_ba2f.txt")
|
||||||
|
BA2g("for_real/rosalind_ba2g.txt") |
||||||
|
} |
@ -0,0 +1,11 @@ |
|||||||
|
5 1 |
||||||
|
GATTTGGGCCAAAGTCTGCGGCGAA |
||||||
|
GATGTGCGTCAACCAGTCGGAGTCC |
||||||
|
TCACACCGGCTCGGAGATTTTTTTT |
||||||
|
GATCTACAACGCGTGACTATATGCT |
||||||
|
TAAGTGATTTTGTGGCCTTTACTCG |
||||||
|
CCATCTACCCGATGTTCGACCGCGT |
||||||
|
GAGCGCGCTGCCTACATTTGGATCT |
||||||
|
TCCGGGTTAGGATGTTGAAACAAAA |
||||||
|
ATGGAGCCATGATATGTACACTTAG |
||||||
|
GCATGGATCTTACTCCGACGTTATC |
@ -0,0 +1,11 @@ |
|||||||
|
6 |
||||||
|
CCCTAGTCTACCTGTTTGGAGCGGGGCCTGAATTTGACTGGC |
||||||
|
GTCTTTACCGAGTTAGTCTGATGTAAGTACTGCTCCTCTACC |
||||||
|
CCGACATTGCGCTCTACTCTGCGCACATAACTAAACGTTGCA |
||||||
|
CCTCCGTCTACATAGAAGGAGTCTGCAACGCCCCCACTGAGG |
||||||
|
ATCTTGCTCGTATCTACCGATAAGTAGCGAAAATCTAGCGTT |
||||||
|
CGGGGTTACCTGGCAGTGTCTACTAGATCAGATTGCCCGGCT |
||||||
|
TTAGTAAATGAATCTACGTCTCTGAGCGCGCGAATCAGGGTG |
||||||
|
TGAGCACTCTGACTTAACTCTACTACTCTCCAATAAGCGCTC |
||||||
|
TCACGTTCTACACTAGGTAAGTATGCATATTTGCATGAGTCT |
||||||
|
TTTGAAGAAGGCTCTACAAATTTAAACCCAGACTCAGACACG |
@ -0,0 +1,6 @@ |
|||||||
|
GTCACAGCTGCATAACAAGTAAACTGAGAAATCCCCAGTTAGGCGGATTGACCATCGAACACACTTTCACTACTTGCGGATAAATCCTGTAGAACTAGACTTTATCTCGGCTGCGACAAGACAGGAGTTCATGCACCTGCTCTGTCCCTCGCAACAGTCTAGGGAGCAAGTAGGCGGCTTCTTAGCTAGTACCTGGGTAG |
||||||
|
7 |
||||||
|
0.393 0.286 0.286 0.25 0.179 0.321 0.107 |
||||||
|
0.071 0.357 0.25 0.286 0.214 0.393 0.357 |
||||||
|
0.214 0.214 0.143 0.286 0.25 0.143 0.25 |
||||||
|
0.321 0.143 0.321 0.179 0.357 0.143 0.286 |
@ -0,0 +1,26 @@ |
|||||||
|
12 25 |
||||||
|
AGATCCGGTTTTATTCAAGCGAATTAGTGGGAGTGCGAGCATGCGCCAGATTCGTCCGGGATTGTCGTTAGGACACTAAACAGAGTCAGGTGCAGTGAGGAACCGGTCCTCCTTGCTGTCCATCTTTGGCTATCAATCGCTTTGCGGGCGGCATGC |
||||||
|
CGAGCATCCCTTTAACATAATTGCCCGTGGGTGTATTGCGTTTTTCCAACGCATAAGAGCATCTTATGTGTTTATGCGTGGAAGCCTATCACTTTGCATAGCGTTTGGCGATCACCTCCATGCCGCAAGGCCTAAGGCACACGGTTAATTGGGTCA |
||||||
|
AACGAGGCGAACCCTGGAACAGGTACCATGCCTTTGCGATTCAGCTTCTATCCCCGTCTAATTAGACATCTCAGCGTTCCTCAAGCTAGCAGACTGCACAGGGCTTATCCCCGGATGGTCGCTACTTCTCTGTGCATATAGCACGTAATGCCACAT |
||||||
|
CTTCCCGTCGAAATGCTACATAGACTGAGCGATACATGCGGTGCAGTTAGTTTGTTGACCTTATCCCACACTACAACGGCCTGTTACATTGCGCGTGTCTTATGCAAATCGATCGCTTTGTAACCGTAATCCACCATTTCTGGAAAGCATTTCCAG |
||||||
|
ATTAAACATTCCAGCAACACGCGGGCGATCCTGAGGAATCACCGCAACTCACGTCTAGAGCCTGTCCGGCACTCGATTACTTTGTCTTTCGAACCCCGTTGGTTAGTGCACTCTGTCATATAGTGCTAGGCTGCCCTCTCAGACGCGCTCAGTCGT |
||||||
|
TCGGTGTGTACACCTGGTAGAGGAGGAACCAATTAAACTTCGTGAACCCAAGGCGGCCCCCCATTCAGTTCGACTGGGACTCCGGCGCTTTTATGCGCGCGTAGAGGCAGTGACAAGGCTTCCGGTTAAGTCTTCTTTACTGACGCCATGCCTTTG |
||||||
|
CGTATCTCTGTTTAGGCTCCCACCCCGATACCTTTGTTTCTCATATGAGCGCTTGTCTCGCCGCCAGATATCTGACTGGTCCGGTGATCAATGCTTAGGCGTTCAGGTTTACTACTGTCGCGACAAGACGGTCATACGCGCCAAAGGCTTCACAGC |
||||||
|
AAGCGAAGTCCTTTGAATACTAAAACTCACCACTGGGCCGTCCCGACTATAAGTTGTCGCGAACAGAGTTTCTGTTACTTACCTCACTATCTTGCATCCATTCCTTTGGGTATTTGGGTTGTACACGCTATACGATCATGATTAGTCTCTATGCCT |
||||||
|
TAACAACGATGCGGTTCCGTAATCGTAGTGAGAAAACCGGGTAGGAAGTAAGTGTGCATGAACGTTAGGCGCGTCTTGAAGCCAGATGGGTAGCTGGCTAATGTTTCTGCCATAGGACTGGATCACTTGTGCCCAACAGGAACAGCAATTCCTTTG |
||||||
|
GCGATGACTTTGACGGCAGATCCGACCTCGGCTTAGTATGGTGGATGAACCTCCAAGTCACCGGGTCCTAGCATTATTTCGAATGGCCGAGGAGGCCATCATTAGGTAACGCCCAGAGTACATCCCCCGAACACCGAAGGTCGTTCGCGTCCGGCA |
||||||
|
CCTAACGTACCATTTTTGACTGGAAGCCAAAGTTGACCGGCTTTTATAGCTTTTGACGGTCTCCTGTACTCAAGTAGATTTTTGTTAACAAACCTGGCATTGTCGTCATACAGTCAGGGAAGATACTTCCCTAGCTGCACCCACCCAATAGCTTTG |
||||||
|
TGCTCTGACCAGACGATGGCTTTGCTGGAGGTTGAAGGCCATTTTTTTGTTCTAGTGCCCGACAGCTTCATGAGGGCGGTCGACTCTGAGGCTTGAGCAAAACCTAATATAAATGCTGAAGCTTAGCGCACGGCACGGAAATTGGGGGGAACTACT |
||||||
|
CGGAAGCGTTTATGACGGCGACAGGAGTAACCATGAAGAGGAACAGGCGCGACGATGGAACCGCCTTACTACGTTCCGTCACGCCACCCGAGTGGAGTCGGTACCGTTAAGCTGACGGCGCGCTATTCTCTCCTGATTAGGTTACCTATGCCTTTG |
||||||
|
GATGTAGCCATATAAATCATTCATCGTTATTGTGGGCTCTTGTCTACCGTATACACACACCCAATCCCTTTGGGCATTATTCGACTATCCCCTACCTCGCCTACTGCTGATACCACGTTTTAGGCTCCGTTTCATATATATCCCCCTAAACAAGGG |
||||||
|
GATGGAGCGTTGGCGAACCGCTGAGCGAGCTATGAACAGCCTGTGAGACGCGGGGTAGGAGCCATCACTTTGGATCGTTCCCAGTCTTTCTATTATCAGTATCGATATGCGGCAACCAGTTTTCTTGCGCTCTGAACCATCCTATAGTAGAACTTC |
||||||
|
TCCTATACGTAGCCTCGTCCGGCCTGACGTGTCCGGATTCATTTAGAGGCCATTACTTTGCTGTCAGTCGCTGCACTCATGTCGATTGTCGTGGTTGATTTAAAGACCCGCATAGCACAGTACCCTAACCCCAACTTCTCTCTGTTTAGACAGTGC |
||||||
|
GAGCTTTGTATGGAGATTGCGCTTCCGATTGCTTTGAACATCGGACGCGCTTATAGAGACACTCGTGCTGGCAGACCGGTGCGCGATAAACGAATCTCGGCGTGCATTGGTGTTTGGGCTTCCGATGTCAAAGACCGCAGAACTGCGCCGGGGAAT |
||||||
|
CGATCTTCAAAGGCTGGCTTGCATTAGGAGGACTGTGAAGAACACGCTTCTCTTATGACTGCACGGCGGTTGACTACGTCGCTTTGGGGCCACCCTTTCATTGCATGAACAATACCTTTGGTCTTTGACTGATCTTGAGGAGTCCACCGGATCACT |
||||||
|
ACATTTCAAACACACTGTATGGGTTACCCTAATTCGCTGCGCATGCGCTGGGCCTCGAGCGAAGAATGTACGTGCTTTAGCTACTGTCAGTCTATCCAACGAAACTACGGCTTACGTGGTTACAGACCCCATGCTGGTTGGGAATCGATTTCTTTG |
||||||
|
TATAAAGAAGTAGGTCCGTCAGATTCGAGGAATCCTCGATGTCCCTGGTACATGCAAAAGTTCAGAGCCGTAGAACTACTGTAGGCGATTGCTTTGCGCAAAAGGGATCAGTCGCCGTCGTAACTCAAATTTAGTCTTTTCACCAACGTGCAGGGA |
||||||
|
TTTGAGTCATTATTAACGGTGTACGGAGTGACGCCCCCAATGCCTTTGTCCGGCTTGTACCGGATTATCCGCTTGAGTAACTTATTCTTATCTGAGATGTCGGTGGATATTGCCACTTAATCGAAACGATCGTACCTCGCCCGAGTCCTAGCAGCG |
||||||
|
CGCACGTGAATGTAGGAGCCAATCCGGCCTCTTTAGTGCTCCAATCACTAAGGGTAGATTTGTCGCACCACCCGTATGTGATCCCTCAAAGCGAAATCATCTACACTCTCCATAGCTTTGAAATCCAATAGTACAACCTCGGCCGGGTAATCACCA |
||||||
|
ACCATATCTTTGCGGACTTCCGAAAAGATCGAAAAAATAGCTTACTGACCCCCAACCTTGAGGTAAGAGCGGTCCCTCGGTCAGGCGGAACTTCCAGTGTCCGATTAGATCAGGCCGCATAGTGTGGGACTCCGATCAAGTGTATAATATGCAGGT |
||||||
|
GCGGGGGGAGTTTGCTAGGACAGTCGGGCGGTAGTTTGTGTCTTAAGTAACTGCTCGAAGGCTAGAATGTGGGATCATAGCTTCAGCGGATTCCTAGCGATGGCTTTGAAACATGGACGAGTTACTTTTGGCGTTTTTGAGAGTTTATAAGGTGAG |
||||||
|
CCAAACATGGTGGTCACTATTAATTGTCCTCCGCGTACCGAGATACGAGGGGAGTCCTCCCACAATTCGTCGCCGATTTCTTTGAGTCAGGGTATCATAGGGAGTGCTATTCCATAGCGATAACTGCTCCACAGAAGTTCATTAAGTATTTTTTCT |
@ -0,0 +1,26 @@ |
|||||||
|
12 25 |
||||||
|
GATGTGCGTCACAATCCCGCCCTCCAGCTGAACTAGCCAATACTTCCTCTTTCTGCTTCATGATTCACCCAATAGACACTAGGGCTTATACGGGGTGTGTACTTCCCACTGTGGGGCGAGCTGAGTCCATAGTCATGGGCCCGCCTCATCTAAACC |
||||||
|
TCATAGCCGCCGCGCTGGTGGGTGCAAGTCTGCCGACTCCCACCTTATATGTAGCAGGTCACGTAGATAGGAGTGTGTTATATCCTCCGATAATCCCTAAATATAGGATGATACTGGTTCTGCCAGACTCTGTCTGTCTTGAAGTCGCTAGATGAA |
||||||
|
GCGTATAGAAGGAATCCGACTTGGACGCCATTCAGATCAACTAACATAAACAGCAGTAAGCACTTAGCTAGTTGATACCGACGCATACAGAGCGGTCGCGAAAGTCGAGACCGTTCCTTGGTTCAGCAATTCGTGGCTGCCGTCCTCTGGTAAGCC |
||||||
|
ATTGTATGGAACGGTGATGTGCTACTCGCTTAAACCTATAAGCGCACATTTGCAGTCCATGTGATACCTTCCAAGATGTTATTGGCGTGGAAATTAATAAGACGAGACTATTCGTCCTCTGGGTTAAGCTGGTATTTAGAAAGATTATCCAGCAAT |
||||||
|
CGGGCCGCAATGGGCGCCATTACGTTTTTGGTTTAATCATTGCTGGATCTCCCAACCCACTCGCCTGCGGGACGTTCGCTCAATCTCCGCCCACGTCTGTGCTCTATCTCAACGATAGTCCTCGACTATAGCCTCATGTAAACCCGGGAGACCTTT |
||||||
|
ACACAGCGCCTCGCAGGGGGTGTGAGGACCGACGTTGTAAAGGCCATACCGACTCTCAAGGCATTTTGGGAACCCCCACCTGTCTGTGCCATTGAACTCTACTAACCCTAATTCACCTATTGCCGAGCCTTCGGATTTGACGAATACGCATTGCGG |
||||||
|
AGCTAGCCGCTTGCCAAACCGCTCTCAGAAGGAACGAGCCTAATTCTCCTCACGTAATCCGGCCAGTTCATAGGTGTGCAAAAACTTAACACACGTTCGGGTGGGGTTAGCAGCGCAGAATTGAACCCCTTCACGCCGACAGGGTGGCTAACTACA |
||||||
|
ATTCATGATTGCTGAGTCATTGAAACACACCAGAAACGTCAGCGGCTAGACTATGTACCAGCGGAAGCTGGGATTCCTTTAGAAGCTGTTCCCATACTTGGTGGGTGATCCTATGATACTCTCGGTTAATCCGTGCGAATTTAGCTGTCCCTGAGT |
||||||
|
TGGTTGTGATATTAACCAAGTGCGCTCCCCTAAACCTCGAACCTGGGCCATTTAGATGACTTAACGCCGCCTAGCAGCGCCGCCGTGGTCTTACAACTAGATAGAACTGCGAGGCATTCTGCGGGGCGTAAGTGTCGTCACAGCGATCATGGATTC |
||||||
|
AACTTTCATGTCCACGCGCAACCTTCGGTTTCGTCCCTCTGCTAAACCAATTGGTCATTTCTATTGCTGGACCCACCCGCAGGCGGTCGAAAAGAAATACGCTGCCCCGAATACAGCTCCGATTCCTCTTGCCTCCCAGACACAGATGGTAACTTA |
||||||
|
GGTCCGCGTTGTAGTACACGTAAAGCCTTATCGGGGGTCCTAAATCTAGTACCTGGCAGACTCACATAAACCGCTCTCCCCGCCTGGAATCTAATGTCTGTAACTGATCCTTTTCCTTCATATCAATACCTCTGTATCAGTGTCCAAAGGTCAGCT |
||||||
|
AGATGAGGTGCAGTGGGGGGAGTGATGCTAAGTGGTCCCACTGAACTCATTGAACCCATACTCGGTGGGACTAACATCTACAGCCGCACGTCAGAACTACAAACAAGATCGGCACGAACGTTCGCGACTATTTACATCTTAGCGCTCTACTAATCC |
||||||
|
CGCGTTGTTGCCCCCGGATTACTACTCATAGTCTGGACAACTGAGACGATAGCTCATCATCGCGGTAACATGCGTGATTACGGGGAAAATTATGGGGTGACCCAGATCCGCTTCGACGCGCTCACCTAAGCCAACACGCGGGTATCGGTCTAGTAT |
||||||
|
TCTGACGCTTTCCCCACGAGAAGATCATCCCGGTTGTTACGATCCTTCCTCCAATAAACCTGCGCCCCATACTGAGCGGTACCCTTTCAAAGGTAGAGTGCTACCATGCGATGTATCTGAATACCATAAACGTGTGAGTAGGATTGTGGGGGTACA |
||||||
|
GACCGAAAATCGATGGCCAGACCCATCGAAGTCCGCCCCAAGTTGGCTATGAGTTTAATGTGCGCGTTGCTGCTATGAGGATGTCAGGCTGGGCCGAGGGAATGAGAAAGTTTCTGACCTCGAATGAGAGTGAGTCCTCCCGCACTCCGCTAAGCC |
||||||
|
TTAAGGTCTCTGACCGTTGTAAAAATTGTGCCCCCCCATCTTGTAATCGCTAACGAAAGGAAATCTTCTTGACAACCACAACGAGTAACTGCCGACTCGGTGGGTAGCCAGCGGTTGACGAACTGGACAATGCTCTAGTAAACCCCGGTCGACAGC |
||||||
|
TAAGAAACCTTGTCGACTGGGAAACTCGCGTAAACCTTTGCGGAGGCTCCTTATCGTTACCTGACTTCCAGAGGGACTACCGTCGAGTCAAGGGGTCAGTTAGGTAAGCACACGATGCATGCAACCCTTGCTGACTTCTTCAATTTCCGTGACCGT |
||||||
|
GACCATCAAGCCCGTAAATCGATGGTATCCTATGGTATCGTCACCACAGCCTCGTCACTTAAACCAGGCGTAAACCGAGTGAAACGACTCGGCTCGCAGATAGAACCCGTGTAGAAAACCATGTGAAACAAATGAATCACTTTACTCGGGTAAGCC |
||||||
|
TCAGTGTTAGCCTCGGGTCAGAACGGTCTATTGAAGTTAATGACACTCGGGTAGTCGGCGCTCAGCTAATCCGCCGCTAAGCCACGAGCAGGCGACAAAGGAGAATCTGGCGACGGGTAGCACGATAGTTGGGGAGGCCTCGCCTAGGTTAGACAG |
||||||
|
CTCGACTAATCCGAATTACACTGCACCTCCGAGTGGGTCTGGTTACTGCCGGTAGGGAGGCCAAATAGCTTGCCCAACTCGATAAGTCCTATGACGATCGGGCTGTCATAACAAGATTAATAGGGATGTCAAACCGTAGGGTGCACAGACAATTCC |
||||||
|
ATCCAATTCTCGATGGGATTACACCCTAACCAGACTGGGGATCCATAAGTCTTTTTGCCTCTCGCGTAATCCGCGCATGACTGTATACACTTCCCCAACGGGGGGTCAGTGTTTTCTATTCCCGCAGCACGCCCCGTCTAGCCGACCCGAAGGTGC |
||||||
|
GATGTCAGATTACACCATTTCTGGCGCGTTTTGAAAGATCGGACCTTCATATGGGTTCCTGCTCAGCGTGGACCAATGAGAATGGAGAGCCATGAATTAGCACTACGACTCTCCTAAACCATGATTCTGATTTTCTGATCTTCCCATCAGCCGTAC |
||||||
|
GCTAATCCTCCTGAGTTACGAGCATCCATGCGATAAACAAGACCGATTCACATCCAAATTGGCCGTCTCTGTGATGCTGGGCCGGTGAAGTTGACTTCGTAGAGTTTATCTCCAGCCTGCAACCTGAAGGATCTCGACTAACCCTTAAGCGAGCTG |
||||||
|
CCGCTCGAAGATCCCCTCTTGCAGCACGGTGCAGGTTCGGCAGGCTGAAGTCTACACCGCTTTGGTGACAAAGCGAATGACTCACTTAGGCCCGCGCATAGGGCAACCGTACATCACCGACAGAGTGTACAGCTCGGCTAAACCAGACATACGCTT |
||||||
|
GGGGAGCGGTGTCGAAAGAGAAGGCATCTCTGAAGGAGTTAAAAACCACGATTTGAAAGTCCTCTGTATATGCTCGCGTAAGCCTTGCTTTTCCCACTGAGGCTACACAGGCGAGTCCAGCTAATGACGGCGTTCTCATCTCAATGTTGGCGACTG |
@ -0,0 +1,21 @@ |
|||||||
|
15 20 |
||||||
|
CCCCGAGTAATTCCCAGATATAACGTACTCAAATGTTGAAAACAAGTGACCACTGTATCCACGAGGGGTGTAACTCTTATCAATGGCATAAGGGCCACGAGGACTCTACCATAAGAGCACAGCGCCAGCTGACGAATGAGTATCTCTGACCAACCGATTGCGATCTGTTGTTGGCAGATAGGCCCGCAGGACCCCGAGTAATTCCC |
||||||
|
AGATATAACGTACTCAAATGTTGAAAACAAGTGACCACTGTATCCACGAGGGGTGTAACTCTTATCAATGGCATAAGGGCCACGAGGACTCTACCATAAGAGCACAGCGCCAGCTGACGAATGAGAGATTTGACTACTAGTATCTCTGACCAACCGATTGCGATCTGTTGTTGGCAGATAGGCCCGCAGGACCCCGAGTAATTCCC |
||||||
|
ATAGTGCGTACACCACAAGTGAGATATGATACTTCGACCCAGAGGTAAAGATAAGATCTAGTATTAACCCCGGAGCGAAGGGAGAATGGTACGATCTTGAACAGACTACTCATCGCCGATATGAGTCGAAGATAATGCTGTCATCAAAAGTGGCTTTGTTGAGGTTAACACTGTAGACTGGATGCAAGGCCGATGAATTATAAGTC |
||||||
|
CGGGCTCGGAGAACAGACTAGGGGTACGAAAAGGTTCCGAAATTAGCACGCGCGCGTATAAAAAGATCGACGACCATGCCCGAGTTAGCTCACAGGAACAACTTTGGATAGTTAGATCCCAGCTGACAGTTCGAACTACGCAATCAGGGCTCCTCTGGATTCATACTCTAAGCATGAGAAGGCACAGAGCAAACAGCTACTTGGAT |
||||||
|
CTTCAGTTAATGATGCCTCAGAGGTCGGCGTTGAACCGCGTAACAGACTACTATCTTTATGCGCAGTACAGTTGTAATATGACTAAGGCGCCCGCGAACCGTTCCAACGTGCCGAGAAGGTTGGCCTACAAGGAAGAAGCCGGTCATTCAGTCTTCAAGGCCAACGGTCCTGCACAGATGATTACGCACCGATCAGTATAATGTGT |
||||||
|
CCATTGGGTGAGTTGATTCCATGATTCGTAGAAGCCACTACTAGGTGAGCTAGGCTCCTCTACAGTATAGAGAAGAGCCTTTAAGCCTATCCTGGAGCCTCTCACCCCACAATCGTAAGAACTTGGGTGCGTGAATGACTGAAGTATACATCACCTTAACTCATATTGTTGATCCGCTGTTGTCTGATTGGTAGGCTTGGTAGCGC |
||||||
|
CGAGCGCTTTTTGCACACCGAACGTGTCAGTTCCACATGAGCGTGACAGAGTGCCCGCGCATGGGGTAATCCCGTATCAGAACAGTAAACTAGGTCATGTCCTCCATCGTCTTAGAAGGGGACAACCCCGCAGGGTATGCTAAGAAGTGGAGTAGAGAGTGTTGTGCTGAACACGCGTATCCGGCGGTTTCAAAGTCCAAGGTTGA |
||||||
|
TGTCGTCCCTCTTCTTTTCACTCACATGTATGCCGCTAATACAGACCAACTAAAGAAGAACCAGCTACTAGTGCCATACCTCAAAAGCTAGAACTGTAATAACACACCGCTCGTTGTGGGCCGATTGTATATTAGTAAAGCAGCCAATATTTGTAACACGGCCGATGACCGTGCAAGTTTCCCTTGGAGGCAATGGCATCAAATTC |
||||||
|
TGAGGTGATTGTTTATCCAGATTGGCTGTTTGTCTGAGAAGCTTGTAGCGAGATTCGACTACTAGCTATAGCGACTCAAAATGCTGCGCATTCCCAACTAAAGTAAAACGCAAGCTTAGATGCAGAATTGAGATCACTTGTCTGGATTCATTTTTAATGTGGCGCTACAGGTGCATGTCATGCCCGGTAGGTTGAGAGGCTCTCAA |
||||||
|
ACGTGGAGCGATCCTACCGGTGTATGTGTACCATCCTGGCTGAAAGGCAGTCGACTGGCCACCGTTCGGGCGGCTGCGTTAGAGCTGACTACTCGCATAGGTCTAGAGCGATGACCCCTTGTTTGTCAGCGTATATCCTGGGTAATCGTTGTACCGCCTTTCAGGCCGATCCTAGGCAAGACTACTAGAACTAGGAGAACATTGCG |
||||||
|
TAGACCGCCACCCAGGGTGCTTCCTTAGATCAATCCCGCGTGTAATTAAGCGTAGGGGAGACGCTCCTGAGAACAGACTCTCAGCTAAGCGTAGGAGGAGCATTTTTTTTGGATGAGGCCTCCTCTGTGGATACAAACGGCTCGGTCAACCAGCCACACGAATCAAGTGGAGGATATGTTGTAGTCCGCTCATTCCGAACTTTTAC |
||||||
|
GATATTTACGCAGTCGGACACCCCTCTTCCTTAATGCTTCGTTAAAATATCGGTCCGTGCGAGTTCTGGTGGCCCGTAGGCCTCGCTACTTAGAGGTGGGCCCCCAAGGGCACGCATAATCGGCGCCTACCGCGGTAGATCAAAGTGAACGAAGATCTTGTAAAGATTCTATGCGGAGGCAGAACATCTTACTAGGTAAGAGATTC |
||||||
|
GCTAAAGCTATCCAAGGATACCGGGCGGTGCTAAACTATTACAGTATAACATCAATCAATCACCGCTTCGCCTTCCTCGTAAGTCATACTGCATATGGGCTTCCACTATAGAGTAATCCCGTGAGTGGACGAGAAAGCACTACTAGACAACACGGACACCGTTACAGTCATAGTTGCGGCAGTGCTCATAAGTCCTTCACACAGGG |
||||||
|
TCCCGCTATATCTGATCTTTGTGACTGCTGGGGTATAACTCAGCTGGCACAGAACAGACCCGTAGTCCAGAAAGACTTGAGTATAATGACGCCATGCTCGACCGGAGGTCAAGTACGGGAGAGGCTTACGCAGAATCCCCCAAGAGGCTCGTTAACTGACCGCAATCGATTTCAATTCCCTCCGAGCTCACCGAGCGCTGGTATAG |
||||||
|
TTCCTGACTGCTTGAGCCAGCGCTATATTGCGCGCAAACCAGTCGCGTGACGTACCGTCATACCGAGTGATGTAAAGTTGATTATTGGGATCAGCTAATTCCTCGCGGTGTTAGTTCATCACTTTTGAGTCCGACAGACTACTAGTTCACTACTTCGAAGTTTGCTCTTACAAGCGAGGAACGGCCTGCCGAGTATACAGGGGCGT |
||||||
|
CAGGCCTCGCGATTAACCTTATGCGGCCTCTCAGAGCCCCGTCTAGCGAAGGTAATATGAGAACAGACTACAGAGACTACGCTCTGCCCCCTAAGGACTGGGAATTTATGGCCCTATATCGCCCTTATTCGCCATAACTTCTAAGATTTGCTATTACCATTCTGAGGCAGTTTAACTAAATGGCTATCCAACCTATGAGGACTGAA |
||||||
|
GATCCAGAACAACTTACTAGTCCGTAGGGCTGGACCTCTTCATGCCCGGGCGTCGGGCACATACGCGTATCAAAATGGGAGATGGCATTATCTACTTCTCCTGTGATTTTGAACGTAGTCACCCCACATCATACTTTTCACACCTCGTACTGGTGATTCCATTCTACCCAACGATACGTAATTGACCCCGCTTTTGATTGGAATTT |
||||||
|
CAAGAGTCGTACGAGCCCTCCGTCATCAATGCTTGCGATTAGAGTTCACGGTAGAATCCACCAGAGCAGAAGAGAACGCCAGTAGCGACCAGAAAGCCTTTAGAAAAGGCAGACTACTAGAATGTTGTGTGTAAGTGTACCAAACATTGATTCGGACGGTTGTCGTGTTCGAACCAGGTGATTTGGTGAGGTTTCAGCGCCTAGTG |
||||||
|
TTGTGATGATCTTCGTAAACATCCGGTGGGAGCCCCCCTCTCCTCGATGACTGTTTGACTATGATCCATTTACCTTGACTCGCAGGACGACAAGCCATTATTCATGCCGTGTGATAAGAACGCTCTACTAGGTTCTGTCCATGCCCAGCACAGATCAAGGGACCCGGCGGGCCCGGGTCAGAACTTTGGTCACCATTCGCAAATCA |
||||||
|
AGAGGCACCTGGCGCCAAAGGCATTTAATGAACATGGCGAACTGCCAGACGAGCATGGTAGAACAGAAGCCTAGGACCATCCCGACATAACAACCACTATTTATAATTGAACTATCTTGGCACACACGCTATTGGCGTTGCACTGAGACCGTTCATCGCCTTCACTGTGACCATTCGCCTATAGACATATAACTAACTTGGCTTCA |
@ -0,0 +1,21 @@ |
|||||||
|
15 20 2000 |
||||||
|
AGTGGTTTAATCGGACGAGGCGTGTCCCTCAGCCCGATAACCATCCCGTCCTGTGTGCGACCGTTGAGCATCGTATTAGTTCCGTAGGATTTTGCGGTCGTCTATTTGATATAAAGTCAGGTATATATGGCCACAAGTTCGCGTGGACCGTTAGCGCACCAACACTGTAATATAACTGCCTTAAGGTAGCGACTCGCCAAGCGCAGGGGCAGCCCTGACAGTTTCCACGAAACTCAAGAGAGTATGTAGCGACAGTCCTTCGCAAGACAATCGTACGTGTCTACCGAAACTTAATTTCGTTAGTGGTTTAATCGGA |
||||||
|
CGAGGCGTGTCCCTCAGCCCGATAACCATCCCGTCCTGTGTGCGACCGTTGAGCATCGTATTAGTTCCGTAGGATTTTGCGGTCGTCTATTTGATATAAAGTCAGGTATATATGGCCACAAGTTCGCGTGGACCGTTAGCGCACCAACACTGTAATATAACTGCCTTAAGGTAGCGACTCGCCAAGCGCAGGGGCATCGAAGAGTGTGCATGCCCTGACAGTTTCCACGAAACTCAAGAGAGTATGTAGCGACAGTCCTTCGCAAGACAATCGTACGTGTCTACCGAAACTTAATTTCGTTAGTGGTTTAATCGGA |
||||||
|
TCGCTAAGTGGTGATACCGGCTGATAAGAAAGTAAGATTTCAGCATGACCCTGTTGATTCCACCCCTTCCTTTCATGGTGAGGCTTGTCTTTGCGGCGCCTCACGGTACCTGTGGACTGTACACACGAAGCACAACTTCCGAACTATTCGTTTGTAGACTATAAATCACCATGCTCATCAAGCTCAAAATTTCTCCTTACACCGACCGCGGTGGGAAAAAACGCAACGAAGCTCCAATTATCTCCAGTCTCTGCACGTGTAGAGATGGTGGAAAGCTAAGAGATGCCTTCGCCACATTAAGTCCCGCACAACGTTA |
||||||
|
ATTGGCAAAACCGATAGGATCCCGCGACTATGACGTCGCTTTTCGCTAAGTGTGGGCTGACCCTCCTACAAATAAGTCTCGTTTTAACCCTGGCCATTGCTTACAACCGCCGAAGTCGCGCTTCAATCAAAGGTGCAGGGTTATAATAGACATACAATTAGGATGTTTGACCGACATGCCTTGTTAACTTTAATTGACGTTACAGATTGATTATGCGATCTCTTTATGTTCTCAATTTAATATACCTCCGCTGGTTCCTATTGGGAGCCTTCAACACATAATAAATCCTTGTACCTCTGATTGAGTCTCTTTGCCT |
||||||
|
ATGTTCCTTAAGTAATTAATAGTACGTACACCGGTATTCGCTAGCCGTGCATCTTGACCCCCCCAAGGCGAACAGTTTGGATTTGCGAAGTCCCACGAAGGGGGCTTAAGGCTTGAGCCACATCCAGTTATGAAAGTATATCATCGGCACCCAGGAGGCTAGACAGGAGGTCAGAAAATTCCGCATTAGCGTCGTTGCGCAAGGCCGTCGCCGCCCGTGCTTCCAGGATTAGATCGCCTGCCAGACAGTCCGACTCCGTTGACAATAGAAGACAAGCTTATGCCCCGATTCACTCACCACCCAGACAGGCCCGCAT |
||||||
|
TGCTCGGACTGATATCCGCGTATGCGTACGTAATGTCTAGCAGGCGGTCGAGCCATAGGCTTCAATAGGGGTGTTGCGACTAAGCGATTGGCACAGGAAGCATTGGAATTAACACCGCAGTCATCTAAGTGTGCATACGGGCATGTGGAGATTTTTCTACGAATGATGCGTCAACGACCCATGGAATGAGTTTTTAGTTGTTACCCATTTTTATAATAACGTGCGCGGTTTATCTTATCCCTTATAATGATCTCTAACATAGGCGTACCTGAAAAGAATGCATTAAGCCGCAAAGGAGCCCAATTCTCAGCCGTCG |
||||||
|
AATCTAAGTTACTTCATGGTTCACGGTGCCACATCGACTGATCATCCATGCCTAGTGCTGTACTTAACCCATCATATTTCCTAAGTGCTTCGAACCCTTCGATCGGGGTGGTCATCTGTCCGTGACAAGGCTGCTAATAACCCACGCCGGTATAACACTGATTACGTTATACGCCTTATTCGGCAGTGACTGGCGTGCCACGTGCCAGGTAAAAAGAAATCTGGAACAGGGCTCCTCTTTCATAAGTGTGCATTTAAATAAGGGGAATAGTAAGGTCTCATTTGTAGTGCACGTGCCTTTCAATTATAGGCCCATA |
||||||
|
CGCCATCCACGTTTTAGTAATGTACCCAGGCCAACTAACACATAGCAACCGTCAGTTTTCACAGTTGTCATCTTGCCGCCCGAATAAGCCCCGCTGACCAACGTCTGAGGACGTTCTCCGCGGAGATGAGGGTATAGCGTCGTCGTACCTGCATTACCGAACAACTCTCCATCTTTAGGGAATTACCCATCTTAGCTATAGGACACAAGAGTCGACAGTAATTGTGGACTGGCTTTTGCGGTCTCGGTTCAATCAAGGAAAACCCTCTTGCACTACAATCGCAGCGTGTGCATTCAGAGCCCTTATCATACCTCGA |
||||||
|
AGGCGAGGTGAGGTCATGACCTGTCTAACCCCTTAGCGCCGGTGTAAATTCAATGCACGTAACGCTAGAGGCCTTAGGCCTCGATTCATCCTTGTGATCCATGTAATCGAACGACGCCTATCTAGTTCCGGAGCTTCGAACGAGGCCGATTAAAAACCCGTTGGCCGGTTGATCTGTGCTGCTCAAGATGTAACATCCCGAACTCAGCGCATCACGCCCCGCCAGAGCCTTTGGGAGCAGGGCCAGCGCTCGCTGGTTGTGCATGCGCTGCAATTCAAACACCTGTGGCACGAGTGCCCCAAAGGACCATCATCGA |
||||||
|
TTGCAATACAGTGCCCCTTGTGCTGTTTGCTAGGCGAATACAGGCGACCGACACAAAGCCCGGCCCTATATCAGTACGAGGCAACATACTGCTCGCTAAACTAGCGTATAAATTTGGACACCATAATGCGCAATGCACGCGGTATAGGTGGTCTTGTGGTAAGAGGGATTTCTAAATATCGTATTCCCCAACGCAGGTATACGAAGCCCATGGTAATTTAAGCGTTTAAACAGCTAGAACTCGCTCGCTCTTTGTGCATGTACTAGGTCCTTGTGTAGAAGAGGCGCAATGCCTTGCTATAGACCTTTGTCCCGTC |
||||||
|
GTGCGAAAATGCATAATTATAACTTTTCGCCTCGGGCGCGTCCACGGTATTACGAAGTCGAAGCGCGCATCCACTGACAATTCACAGACAGCAAAAATTGTTGCATTTAATCACGGGGACCCTATGGTGCAGTTGCGAGACCACATATGACCGGCTTTGTAGCACCGGCCCCAGTTTAATTCCCCTGACTAATGGTAGATACGACGACCGCCCCCCACAGACTCACTCACCTCCAGCACCAGCTCAACGGTGACCCCTTTCTGTTCTAATGCGGTATTCGCTAAGTGTCACTAAGGTTAGTGCGGCTTTTGCTGCA |
||||||
|
GTAGTCCGTGGCATATGGAAGGGGAGCTTTACTCCCTGATCGGTGAGTGTGGAGACGTTTAGCGTACTGGTCACGGCAAGAGACGTTGTGAGTGTTGTATATGTTTCTTAGGAAAGCGGACGGCGTTACGCATGAGTAAGACGGCCTAAAGAATGAACCATGATTGATAATCTATTAATTGTTAGGTAAGGATAAGCAAAAAGGTGCTGCTGGGTCTTCAAGTAAGGGATATTCCCTCGCAGTGTGTGCATGACTCATATGGTTAACCCGTCTAAGCAAAAATCTCGACAGGAAGGTGAGTCGGCGCCATATGAGG |
||||||
|
GGCGACAGCTGGTAACACGGCTATCGGGGCCGAATTGCCGACACTTGGCGCATCGCTGGAAGTGCTTCAGATAGTTATGACGGTGAAACACGCTCCGGCACAGCCTATAGTATGTTCGTAGCGTACAAAAGCTAGGCAGCCGTAATATCAGCGCTTAATGCTTTAATTGGCATGTGTCCTATTTGTGACGTTGATGTCGATGATAATCCGCACAGAACAACTCATGCATATTCGACGAGTGTGCATGATAGACACTGCGTTGTTGCCATGTATCTCCTGAAGCACGCACACAACGAAGGTCGCGTGCTTTTTCCGG |
||||||
|
ATACTACGATCAAGCGAAAAGGGGACAATCGTTGGCAGGGTCACTAGGGCAGGGTCTTAGAGAATCAGTGCAACGTTTTAATTCGCTCACCTGTGCCGCTAAGTGTGCGATGTGTAATCTCCAATGGGAAATGAATCCTTCGGCTCGAGTAATGATTGCGTATGGTATTGGCCCAACGTAGGACTCAAGTCCCTGGTCGTAGCGCGATCTGTAATGTAAACTTACCATACGAGCAGGCTACGTTGAGGAGCGCTCGCGGAACGGTATAGAAAACGGTACGTCATATTGGCCCTTGTGACTCCTCCTCCGCTTGGAT |
||||||
|
TTTTATTACATCTGTTCGCTAATCCTGCATGGTAAACAGTTACAGGAATTGCGATACTCCACTGGGCCACCACTACTTCACTAACTGGTAAAATGCCGGTCAGCTCATAGGTCCAAATAACGTTTATGGGTGTTAGCAATGTTAGCGATGCGATCTGTAAGATCCGAATCGTATCACAGCGCGCTCCTGCCAACGTCACCTGCTCCACCTAGCACTTGTCGATAACTCCCCCTCCAATCTACAAACACCAACTCGAAAATTGACACGCGCTTCTCGGCTGTTGGTGCACTCTGATGTTAATATGATGCCATGAGCC |
||||||
|
TGCCAGATACCTACCCTCTATATTCCAAACGTGAGTGAACTAAGTTCGATTACGAACCCGTACGTGGCTGAGCACGGCTAGTACCGGCCCGATTGTACCGTTCATATTGATTAGAGTGACCGGAGCACACAATCATCGCCCCCTTAAGCTATAACTGTCCCCGGAGCCTGAGCCTTTGACAACTTCGATAGGTTAATAAAAGAGTCTCGCTAAGTAGCCATATGATGAGTGATAGGGAGGCCTAGACCTGGACAACCCCTCATTTACGTCCGAACTCGGAGTGAAGTGTAATGTGAGCTCTTAAAAGGAGCTGGAT |
||||||
|
AGGCTCTCTGATTTAGCGGTGACCGCGTCCCGATTCTCACTCCTCAGAAGGTCTGGTAGCGCTACGGGGATGGGAACCCAGACACTCGAATCGAATCGGTATGACACTAGTACAAGGGGGCCTTTAACGCACGAGAATAACACAATTCCTTCGCATACATAGCTAACAGCAGACATAGGTCTTGATAAAAACTGTGCTGCTTCCTCAGAGTCGCTAAGAAAGCATGCCAGTGCACACTGGACATTACGCCGCAGTACAGCAATTCGCGTCTCAGATCAACCTGGGGAAATAAAACGTCTTTGCGTTAGCCCTTTGT |
||||||
|
TGCCTTGATGTCCAGGCATAGGTCATGTACGGGCTACCGTCATGTCCATGTCAGAATCCGAATGATCCTCTGGATTCCGATCCCGGCAGAGGGTAACTGTGCGACCTCAGCTTCCTCATCCCGCTATCGCTCACGGCCGGTCCTAGTGCGGCATGGATAAGCTCATCCAGGATGATTTACCCAAACCCTTTCACGTGGTGGTGGGGCGCGACTGTCACGCAGGAGAGTCGCCCGAGCTGTGGGCAGGAATACTTTCCTAAAGTGTGCATGTTAGGAAGAGACGTTAACTGCGCCTCCCTATCCTATCTGAGTGGCG |
||||||
|
ATAACGACCTGTGTGTTCATCGTATCTTCTCGAACACTTATGTAGATTCGCGTGGCTACGTTGTACATTCACTCCACTCAAGAGCGAAGGGTGACGTTTTCACTCCTCGCTGGAAAACCTAGAACGGGCTGTTTTTTACGATCAAAACAAAACCACTTGATAATTGTACTATTGTCTGGTAAGCTAAGTGTGCAATCAAGATCAACTCAATCCCCTGCCACCATAGTGTGGGCACCACGTAGAGAATTCGTCGAACAGATAACGCAAACTGACAGGGAGCTTAATGAACCATCAGCCGATCACCTTCGTGAGCATC |
||||||
|
TCGCTACTGGTGCATCCTATCTATTGATATTGACAACCCGGGATTAGTGACAACCGATTTCAGACTAAACTAGTTAGTAAAGCATTTCTCTATCTCCGCCGAGTGGACGGTGATCTAAGCAAGTAGGTGCCAGGAGGCCCATAACCGCCAATGACTTTCATGATCTAATCGACGGTTCGTTTTGAGGTTGGGGTACGCTCATAACCTTTATGTTTTGGTACACGCCTGTCACCTGCGCCGTGGTATCTGAGACATTTGTCTCCTGGACTAGTTGATTCCAGTATTCACAGAACGCCGGGATACGTTTCCGTCAATA |
@ -0,0 +1,77 @@ |
|||||||
|
import jinja2 |
||||||
|
import os |
||||||
|
|
||||||
|
def main(): |
||||||
|
|
||||||
|
# Jinja env |
||||||
|
env = jinja2.Environment(loader=jinja2.FileSystemLoader('.')) |
||||||
|
|
||||||
|
problems = [ |
||||||
|
{ |
||||||
|
'chapter': '2', |
||||||
|
'problem': 'a', |
||||||
|
'title': 'Implement Motif Enumeration', |
||||||
|
'description': 'Given a collection of strings of DNA, find all (k,d)-motifs: kmers that appear in every DNA string with at most d mismatches.', |
||||||
|
'url': 'http://rosalind.info/problems/ba2a/' |
||||||
|
}, |
||||||
|
{ |
||||||
|
'chapter': '2', |
||||||
|
'problem': 'b', |
||||||
|
'title': 'Find a Median String', |
||||||
|
'description': 'Given a kmer length k and a set of strings of DNA, find the kmer(s) that minimize the sum of distances (the L1 norm) to all of the DNA strings.', |
||||||
|
'url': 'http://rosalind.info/problems/ba2b/' |
||||||
|
}, |
||||||
|
{ |
||||||
|
'chapter': '2', |
||||||
|
'problem': 'c', |
||||||
|
'title': 'Find a Profile-most Probable k-mer in a String', |
||||||
|
'description': 'Given a profile matrix and a DNA string, find the k-mer in the string that the profile is most likely to generate.', |
||||||
|
'url': 'http://rosalind.info/problems/ba2c/' |
||||||
|
}, |
||||||
|
{ |
||||||
|
'chapter': '2', |
||||||
|
'problem': 'd', |
||||||
|
'title': 'Implement GreedyMotifSearch', |
||||||
|
'description': 'Find a collection of motif strings using a greedy motif search. Return first-occurring profile-most probable kmer.', |
||||||
|
'url': 'http://rosalind.info/problems/ba2d/' |
||||||
|
}, |
||||||
|
{ |
||||||
|
'chapter': '2', |
||||||
|
'problem': 'e', |
||||||
|
'title': 'Implement GreedyMotifSearch with Pseudocounts', |
||||||
|
'description': 'Re-implement problem BA2d (greedy motif search) using pseudocounts, which avoid setting probabilities to an absolute value of zero.', |
||||||
|
'url': 'http://rosalind.info/problems/ba2e/' |
||||||
|
}, |
||||||
|
{ |
||||||
|
'chapter': '2', |
||||||
|
'problem': 'f', |
||||||
|
'title': 'Implement RandomizedMotifSearch with Pseudocounts', |
||||||
|
'description': 'Re-implement problem BA2e (greedy motif search with pseudocounts) but use a random, instead of greedy, algorithm to pick motif kmers from each DNA string.', |
||||||
|
'url': 'http://rosalind.info/problems/ba2f/' |
||||||
|
}, |
||||||
|
{ |
||||||
|
'chapter': '2', |
||||||
|
'problem': 'g', |
||||||
|
'title': 'Implement GibbsSampler', |
||||||
|
'description': 'Compute the probability of each kmer in a DNA string from the profile, and use these probabilities as weights. GibbsSampler draws a random k-mer from this weighted distribution.', |
||||||
|
'url': 'http://rosalind.info/problems/ba2g/' |
||||||
|
}, |
||||||
|
] |
||||||
|
|
||||||
|
print("Writing problem boilerplate code") |
||||||
|
|
||||||
|
t = 'template.go.j2' |
||||||
|
for problem in problems: |
||||||
|
contents = env.get_template(t).render(**problem) |
||||||
|
fname = 'ba'+problem['chapter']+problem['problem']+'.go' |
||||||
|
if not os.path.exists(fname): |
||||||
|
print("Writing to file %s..."%(fname)) |
||||||
|
with open(fname,'w') as f: |
||||||
|
f.write(contents) |
||||||
|
else: |
||||||
|
print("File %s already exists, skipping..."%(fname)) |
||||||
|
|
||||||
|
print("Done") |
||||||
|
|
||||||
|
if __name__=="__main__": |
||||||
|
main() |
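The median-string description in the problem list above (BA2b) depends on a distance between a candidate kmer and a set of DNA strings. A minimal sketch of that distance follows, assuming it is the sum over all strings of the smallest Hamming distance between the pattern and any window of the string. The names `hammingDistance` and `patternDistance` are hypothetical and are not taken from the `rosalind` package.

```go
package main

// hammingDistance counts the positions at which a and b differ.
// It compares only up to the shorter length, so callers should
// pass strings of equal length.
func hammingDistance(a, b string) int {
	d := 0
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i] != b[i] {
			d++
		}
	}
	return d
}

// patternDistance sums, over every DNA string, the minimum Hamming
// distance between pattern and any window of the same length.
// A string shorter than the pattern contributes len(pattern).
func patternDistance(pattern string, dna []string) int {
	k := len(pattern)
	total := 0
	for _, text := range dna {
		best := k // largest possible distance for a k-mer
		for i := 0; i+k <= len(text); i++ {
			if d := hammingDistance(pattern, text[i:i+k]); d < best {
				best = d
			}
		}
		total += best
	}
	return total
}
```

A median string search would then evaluate `patternDistance` for every candidate kmer and keep the candidates with the smallest total.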
@ -0,0 +1,49 @@ |
|||||||
|
package rosalindchapter{{chapter}} |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info |
||||||
|
// Problem BA{{chapter}}{{problem}}: {{title}} |
||||||
|
func BA{{chapter}}{{problem}}Description() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA{{chapter}}{{problem}}:", |
||||||
|
"{{title}}", |
||||||
|
"", |
||||||
|
"{{description}}", |
||||||
|
"", |
||||||
|
"URL: {{url}}", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem |
||||||
|
func BA{{chapter}}{{problem}}(filename string) { |
||||||
|
|
||||||
|
BA{{chapter}}{{problem}}Description() |
||||||
|
|
||||||
|
// Read the contents of the input file |
||||||
|
// into a slice of lines |
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
//// Input file contents |
||||||
|
//input := lines[0] |
||||||
|
//pattern := lines[1] |
||||||
|
//result := rosa.PatternCount(input, pattern) |
||||||
|
// |
||||||
|
//fmt.Println("") |
||||||
|
//fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
//fmt.Println(result) |
||||||
|
} |
||||||
|
|
@ -0,0 +1,60 @@ |
|||||||
|
package rosalindchapter3 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strconv" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA3a: Generate k-mer Composition of a String
|
||||||
|
func BA3aDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA3a:", |
||||||
|
"Generate k-mer Composition of a String", |
||||||
|
"", |
||||||
|
"Given an input string, generate a list of all kmers that are in the input string.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba3a/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA3a(filename string) { |
||||||
|
|
||||||
|
BA3aDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a slice of lines
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Input file contents
|
||||||
|
k_str := lines[0] |
||||||
|
k, err := strconv.Atoi(k_str) |
||||||
|
if err != nil { |
||||||
|
msg := fmt.Sprintf("Error: string to int conversion failed for %s\n", |
||||||
|
k_str) |
||||||
|
log.Fatal(msg) |
||||||
|
} |
||||||
|
|
||||||
|
input := lines[1] |
||||||
|
|
||||||
|
result, err := rosa.KmerComposition(input, k) |
if err != nil { |
log.Fatalf("rosa.KmerComposition: %v", err) |
} |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
for _, kmer := range result { |
||||||
|
fmt.Printf("%s\n", kmer) |
||||||
|
} |
||||||
|
fmt.Printf("\n") |
||||||
|
} |
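The k-mer composition that BA3a prints can be sketched independently of the `rosalind` package. The helper below is a hypothetical stand-in, not the package's `KmerComposition`: it assumes the composition keeps duplicate k-mers and lists them in lexicographic order, as the Rosalind problem statement asks.

```go
package main

import (
	"fmt"
	"sort"
)

// kmerComposition returns every length-k substring of input in
// lexicographic order, keeping duplicates. Hypothetical helper for
// illustration only.
func kmerComposition(input string, k int) ([]string, error) {
	if k < 1 || k > len(input) {
		return nil, fmt.Errorf("k must be between 1 and %d, got %d", len(input), k)
	}
	kmers := make([]string, 0, len(input)-k+1)
	// Slide a window of width k across the input.
	for i := 0; i+k <= len(input); i++ {
		kmers = append(kmers, input[i:i+k])
	}
	sort.Strings(kmers)
	return kmers, nil
}
```

Returning an error for an out-of-range `k` lets the caller decide how to report it, instead of panicking on a bad slice index.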
@ -0,0 +1,54 @@ |
|||||||
|
package rosalindchapter3 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA3b: Reconstruct string from genome path
|
||||||
|
func BA3bDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA3b:", |
||||||
|
"Reconstruct string from genome path", |
||||||
|
"", |
||||||
|
"Reconstruct a string from its genome path, i.e., sequential fragments of overlapping DNA.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba3b/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA3b(filename string) { |
||||||
|
|
||||||
|
BA3bDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a slice of lines
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Trim spaces from each line; each line is one fragment of the genome path
|
||||||
|
for i, line := range lines { |
||||||
|
lines[i] = strings.Trim(line, " ") |
||||||
|
} |
||||||
|
|
||||||
|
genome, err := rosa.ReconstructGenomeFromPath(lines) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("rosa.ReconstructGenomeFromPath: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(genome) |
||||||
|
} |
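Reconstructing a string from its genome path amounts to keeping the first fragment and appending the last symbol of every later fragment. The sketch below assumes that all fragments have the same length and that consecutive fragments overlap by all but one symbol. `reconstructFromPath` is a hypothetical name, not the package's `ReconstructGenomeFromPath`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// reconstructFromPath rebuilds a string from consecutive k-mers in
// which each k-mer's prefix equals the previous k-mer's suffix.
// Hypothetical helper for illustration only.
func reconstructFromPath(path []string) (string, error) {
	if len(path) == 0 || len(path[0]) == 0 {
		return "", errors.New("genome path is empty")
	}
	k := len(path[0])
	var sb strings.Builder
	sb.WriteString(path[0])
	for i := 1; i < len(path); i++ {
		prev, cur := path[i-1], path[i]
		// Reject fragments that do not overlap by k-1 symbols.
		if len(cur) != k || prev[1:] != cur[:k-1] {
			return "", fmt.Errorf("fragment %d (%q) does not overlap %q", i, cur, prev)
		}
		sb.WriteByte(cur[k-1])
	}
	return sb.String(), nil
}
```

Checking the overlap at every step costs little and turns a malformed input file into an explicit error.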
@ -0,0 +1,54 @@ |
|||||||
|
package rosalindchapter3 |
||||||
|
|
||||||
|
import ( |
||||||
|
"fmt" |
||||||
|
"log" |
||||||
|
"strings" |
||||||
|
|
||||||
|
rosa "github.com/charlesreid1/go-rosalind/rosalind" |
||||||
|
) |
||||||
|
|
||||||
|
// Print problem description for Rosalind.info
|
||||||
|
// Problem BA3c: Construct the overlap graph of a set of k-mers
|
||||||
|
func BA3cDescription() { |
||||||
|
description := []string{ |
||||||
|
"-----------------------------------------", |
||||||
|
"Rosalind: Problem BA3c:", |
||||||
|
"Construct the overlap graph of a set of k-mers", |
||||||
|
"", |
||||||
|
"Given a set of overlapping k-mers, construct the overlap graph and print a sorted adjacency list.", |
||||||
|
"", |
||||||
|
"URL: http://rosalind.info/problems/ba3c/", |
||||||
|
"", |
||||||
|
} |
||||||
|
for _, line := range description { |
||||||
|
fmt.Println(line) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Run the problem
|
||||||
|
func BA3c(filename string) { |
||||||
|
|
||||||
|
BA3cDescription() |
||||||
|
|
||||||
|
// Read the contents of the input file
|
||||||
|
// into a slice of lines
|
||||||
|
lines, err := rosa.ReadLines(filename) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("rosa.ReadLines: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Trim spaces from each line; each line is one k-mer
|
||||||
|
for i, line := range lines { |
||||||
|
lines[i] = strings.Trim(line, " ") |
||||||
|
} |
||||||
|
|
||||||
|
g, err := rosa.OverlapGraph(lines) |
||||||
|
if err != nil { |
||||||
|
log.Fatalf("rosa.OverlapGraph: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
fmt.Println("") |
||||||
|
fmt.Printf("Computed result from input file: %s\n", filename) |
||||||
|
fmt.Println(g.String()) |
||||||
|
} |
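The overlap graph in BA3c links a k-mer to every k-mer whose prefix equals its suffix. The sketch below is one possible construction, assuming a plain map as the graph type. The names `overlapGraph` and `adjacencyList` are hypothetical, and the type returned by the package's `OverlapGraph` is not shown in this diff.

```go
package main

import "sort"

// overlapGraph maps each k-mer to the k-mers whose (k-1)-prefix
// equals its (k-1)-suffix. Duplicate input k-mers yield duplicate
// edges. Hypothetical helper for illustration only.
func overlapGraph(kmers []string) map[string][]string {
	// Index the k-mers by prefix so each suffix needs one lookup.
	byPrefix := make(map[string][]string)
	for _, kmer := range kmers {
		if kmer == "" {
			continue
		}
		prefix := kmer[:len(kmer)-1]
		byPrefix[prefix] = append(byPrefix[prefix], kmer)
	}
	graph := make(map[string][]string)
	for _, kmer := range kmers {
		if kmer == "" {
			continue
		}
		if targets := byPrefix[kmer[1:]]; len(targets) > 0 {
			graph[kmer] = append(graph[kmer], targets...)
		}
	}
	return graph
}

// adjacencyList renders the graph as sorted "source -> target" lines.
func adjacencyList(graph map[string][]string) []string {
	sources := make([]string, 0, len(graph))
	for source := range graph {
		sources = append(sources, source)
	}
	sort.Strings(sources)
	var lines []string
	for _, source := range sources {
		targets := append([]string(nil), graph[source]...)
		sort.Strings(targets)
		for _, target := range targets {
			lines = append(lines, source+" -> "+target)
		}
	}
	return lines
}
```

Indexing by prefix avoids comparing every pair of k-mers, which matters for inputs with thousands of lines.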
@ -0,0 +1,9 @@ |
|||||||
|
package rosalindchapter3 |
||||||
|
|
||||||
|
import "testing" |
||||||
|
|
||||||
|
func TestChapter03(t *testing.T) { |
||||||
|
BA3a("for_real/rosalind_ba3a.txt") |
||||||
|
BA3b("for_real/rosalind_ba3b.txt") |
||||||
|
BA3c("for_real/rosalind_ba3c.txt") |
||||||
|
} |
@ -0,0 +1,2 @@ |
|||||||
|
50 |
||||||
|
GGGGGAAACTTACGGAGTACAAGAAGACCCGGCACAAAGAGAAAACACGTTGCTCGTTAGCTTAAGTTAAGACGTATCGGATATCTATCGTATCCTCGTAGTATTGCTAGCCACTTCACTGGACCAGGCTTACGTATTAGCCTTATGACCCCATTTCGTCTCCGCTGCTACAGCTGTGGAGTTGACGCGTCCGGTGGGCCCTCCGTTAGCAGGTCAGCTCATATTTTCGGCAAGAAATTACCCGGAACGGACCGAAAATGGGGTACAACATGCCCACCCACAACTTAGTACACAACGCTCAGCAAGTAGCTAACGACCGCTGCCGTCGTCAGTATTAGACGCACTTAACCGTACGGAATCCGTGAGTCCTGTTTCCGCCGATCGAATTACGCGCCCGGGTCGTGGGTCCAAAGGTGGCCGATCTCACGTACTGGTGAGTCGCGCGGTCACTTGGCTGTGAGGTCCACCGGCGGCCACAGTAATCTCTGGTGCACCCAGAATCGAGTCTGGATTGTGCACAAAGCTGCCCGCCTCTATTTCTCGGACCTGGCAGAACGCAACGGATGGGTTGAAGATTGGGCCGGTTCCGATGCCCCAAAGTACCCACATTTACTAGGGTGAGGCTGTTCTTTTGAGAGTAGAGACGAAAGCACCCCGACGTAACTGCTGCACACGGGGCTGCTCGGGATACTGTGCCGGAACTAGCGAGGCTCTACCCTCATCGGAAACCAGGCCTCATAATTCTTACAGCGTACTGTGTACTCCACAAGGAGCTGACCAGACATTCCACGTCCATGGATTCGGCTCATGCATACCTCCCGATCCACTCCTGAGCACATTGGATGGACACTTGAACGATGTCCTTAGCGCACGAGACATCAATTCGTGACGGTAGATTGCTCTCACCCTGATGCGGGTAAGTCACGTATTACCCGGCGTGCGGTATGTAGTAATACAGCTATCTACAACAAATGCAACCCGGCAGGTCTCCCATAGACCA |
File diff suppressed because it is too large
@ -0,0 +1,981 @@ |
|||||||
|
TGTACCTCGGGCTTTAAGGT |
||||||
|
CGGAACGGGAATCCTGGGAG |
||||||
|
ACTAGCTGCACTATCGGTTT |
||||||
|
CGATCTCCAGGAAGATGTTG |
||||||
|
AGCGGGAACTGCTGCAGGGC |
||||||
|
GGGCTCCCGCGCCAGTGCCC |
||||||
|
AATGCCTTTCTTTAGGCAGC |
||||||
|
AGAAACCGAAATCGTGCCCC |
||||||
|
TCATGGTTACAAAATTTACG |
||||||
|
CCAGGTCAAAGGGAGCTATA |
||||||
|
GCGCCCTGGTAAGTTACGCA |
||||||
|
GACAATGGCCCGTGTGAATG |
||||||
|
CGACTAGATAGATCCAATTG |
||||||
|
ATTTTGGACCGCTGTCCTTC |
||||||
|
CGACGTTCAGCGAAGAACGA |
||||||
|
ACAGCCATCAAAGGAGAGCC |
||||||
|
AGCAACAATAATTACGTCAC |
||||||
|
AAATGGTCGAAACTAGCTGC |
||||||
|
CAACCTGAATGTCGGGTCCG |
||||||
|
AAAATTTTGGACCGCTGTCC |
||||||
|
TGGTATCTCAAGAACCCTCT |
||||||
|
GGAGGGCCAACCTGAATGTC |
||||||
|
ACTTGTAACCCGTCGGAACG |
||||||
|
GCCATTTGTGCCTTTTTGGT |
||||||
|
GAACCCTCTCTCTACAGGCT |
||||||
|
GTGCCTTTTTGGTGTGGAAC |
||||||
|
AAGGTCCCACACAGCGTGCC |
||||||
|
TCTCTCTACAGGCTAGTCAT |
||||||
|
CTGCTGGAAACGACTAGATA |
||||||
|
GCTTTAAGGTTAGGGGGTAA |
||||||
|
AGCGAGGCCCTTGACACGAA |
||||||
|
GATTGAATAGAGATATTCGA |
||||||
|
CTCTCTCTACAGGCTAGTCA |
||||||
|
CATCTTTTTGCCCATTATAA |
||||||
|
CTGCAGGGCTCCCGCGCCAG |
||||||
|
CATTAACACGACGTTCAGCG |
||||||
|
TTCAGCGCCCTGGTAAGTTA |
||||||
|
GAATGTGGAAATCGATCGTA |
||||||
|
ATTCAGCGCCCTGGTAAGTT |
||||||
|
TTTGGCCGAACCAGGTCAAA |
||||||
|
ATCGGAGGAGTAGTTGGTAC |
||||||
|
CGAAATCGTGCCCCATGTTC |
||||||
|
GGTTGTGATCAGCGGCCGGG |
||||||
|
GCAGCATTCAGCGCCCTGGT |
||||||
|
GAGGTCCATTATTAACACGG |
||||||
|
CCCATTATAATCCACATGGC |
||||||
|
GTGATCAGCGGCCGGGACAA |
||||||
|
GGGCCAACCTGAATGTCGGG |
||||||
|
CGGAGGAGTAGTTGGTACCA |
||||||
|
GCGAAGAACGAATACGACAG |
||||||
|
GTCACCATTCCCAGCAAATG |
||||||
|
CAACTTGTAACCCGTCGGAA |
||||||
|
CCCGTGTGAATGTGGAAATC |
||||||
|
ACAGAGCGAGGCCCTTGACA |
||||||
|
TGTCCTTCCAGTGCTCGCGG |
||||||
|
GCAATTACGAATGACCAGGC |
||||||
|
AGAACGAATACGACAGAGCG |
||||||
|
CTAGTCATGCGGGCTCGCAC |
||||||
|
ATCCAGGAGAGATTGAATAG |
||||||
|
GAGAGTCACATACACAATGC |
||||||
|
TCCCCTGGAAGCGCGCTCCC |
||||||
|
GTTGGTACCAACGCAAGAGG |
||||||
|
AAATCGATCGTATACCGCAG |
||||||
|
CAGCCATCAAAGGAGAGCCC |
||||||
|
GTCCATTATTAACACGGGCT |
||||||
|
TTAGGCAGCATTCAGCGCCC |
||||||
|
GATCAGCGGCCGGGACAATG |
||||||
|
ATTACGAATGACCAGGCTCA |
||||||
|
GAGCCCCTTGCGCTGTATCG |
||||||
|
TGGTCACTTGCTGCTGGAAA |
||||||
|
AAAATTTACGTCCGAGCTAT |
||||||
|
GGAGTAGTTGGTACCAACGC |
||||||
|
GAAACGACTAGATAGATCCA |
||||||
|
GTGCATCCAGGAGAGATTGA |
||||||
|
GCCGAAATCACGCGCATTGT |
||||||
|
TAGCAACAATAATTACGTCA |
||||||
|
CTCAAGAACCCTCTCTCTAC |
||||||
|
GAATCCTGGGAGGGCCAACC |
||||||
|
TGCCCCATGTTCGCGTTTAA |
||||||
|
CAATGCGGCTGTCGATCTCC |
||||||
|
GGGAGCTATAACAGCCATCA |
||||||
|
TACGAATGACCAGGCTCAAT |
||||||
|
GACATCAAGGTCCCACACAG |
||||||
|
CCTTCCAGTGCTCGCGGGGT |
||||||
|
TTTCTTTAGGCAGCATTCAG |
||||||
|
TCCCAGCAAATGCCTTTCTT |
||||||
|
ACTGCTGCAGGGCTCCCGCG |
||||||
|
CGTTCAGCGAAGAACGAATA |
||||||
|
GGAACGGGAATCCTGGGAGG |
||||||
|
GGCTAGTCATGCGGGCTCGC |
||||||
|
GGTCGAAACTAGCTGCACTA |
||||||
|
TCGGCAAAGTTAGAGCGGGA |
||||||
|
CAGGTCAAAGGGAGCTATAA |
||||||
|
ATAGCAACAATAATTACGTC |
||||||
|
TAATTACGTCACCATTCCCA |
||||||
|
GCAAAAATTTTGGACCGCTG |
||||||
|
GGTCAAAGGGAGCTATAACA |
||||||
|
TAGCTTCACAAATGGTCGAA |
||||||
|
GGAACTGCTGCAGGGCTCCC |
||||||
|
CACTTGCTGCTGGAAACGAC |
||||||
|
CCTGAATGTCGGGTCCGAGT |
||||||
|
AGGTTAGGGGGTAAGGCCAT |
||||||
|
CCAGCAAATGCCTTTCTTTA |
||||||
|
GGAGAGCCCCTTGCGCTGTA |
||||||
|
CCCATTCTTCCCCTGGAAGC |
||||||
|
CCATTCCCAGCAAATGCCTT |
||||||
|
TCGAGAGTCACATACACAAT |
||||||
|
AACCCTCTCTCTACAGGCTA |
||||||
|
CATGTTCGCGTTTAAGATGA |
||||||
|
GATCCAATTGGCCGAAATCA |
||||||
|
GTAGTTGGTACCAACGCAAG |
||||||
|
AACGGGAATCCTGGGAGGGC |
||||||
|
TGGAAATCGATCGTATACCG |
||||||
|
ACCGCTGTCCTTCCAGTGCT |
||||||
|
CGAGCTATAGCAGCAAAAAT |
||||||
|
AGCCCCTTGCGCTGTATCGT |
||||||
|
GATATTCGAGAGTCACATAC |
||||||
|
CGAGTTTGGCCGAACCAGGT |
||||||
|
ACCAGGTCAAAGGGAGCTAT |
||||||
|
CAAAGGTGGTTGTGATCAGC |
||||||
|
TCAGCGGCCGGGACAATGGC |
||||||
|
GGTAAGGCCATTTGTGCCTT |
||||||
|
GAAGTTCGGCAAAGTTAGAG |
||||||
|
AAAAATGGTATCTCAAGAAC |
||||||
|
AAACGACTAGATAGATCCAA |
||||||
|
ATCGTAGCTTCACAAATGGT |
||||||
|
TGAGTGACATCAAGGTCCCA |
||||||
|
CATTCTTCCCCTGGAAGCGC |
||||||
|
GGTCCCACACAGCGTGCCAT |
||||||
|
CAATTACGAATGACCAGGCT |
||||||
|
CTTGTAACCCGTCGGAACGG |
||||||
|
TGATCAGCGGCCGGGACAAT |
||||||
|
ATCCACATGGCTATAGCAAC |
||||||
|
GGCTGTCGATCTCCAGGAAG |
||||||
|
CACATGGCTATAGCAACAAT |
||||||
|
TTAAGATGAGTGACATCAAG |
||||||
|
GAGATTGAATAGAGATATTC |
||||||
|
CCTGGGAGGGCCAACCTGAA |
||||||
|
GGGCTTTAAGGTTAGGGGGT |
||||||
|
AGAGCGAGGCCCTTGACACG |
||||||
|
CCCACACAGCGTGCCATTCA |
||||||
|
TGGCTATAGCAACAATAATT |
||||||
|
CCTGGTAAGTTACGCATTAA |
||||||
|
GCCCTGGTAAGTTACGCATT |
||||||
|
CCCGTCGGAACGGGAATCCT |
||||||
|
ACGTTCAGCGAAGAACGAAT |
||||||
|
AAGTTACGCATTAACACGAC |
||||||
|
GCATCCAGGAGAGATTGAAT |
||||||
|
GGTTTGGTCACTTGCTGCTG |
||||||
|
CCCGCGCCAGTGCCCATCTC |
||||||
|
CCGCTGTCCTTCCAGTGCTC |
||||||
|
GTGTACCTCGGGCTTTAAGG |
||||||
|
ACAAAGGTGGTTGTGATCAG |
||||||
|
CAGTGCTCGCGGGGTGTACC |
||||||
|
AATCGTGCCCCATGTTCGCG |
||||||
|
CCTTTATCTCGTGCATCCAG |
||||||
|
GGGAACTGCTGCAGGGCTCC |
||||||
|
GAAATCGTGCCCCATGTTCG |
||||||
|
TAACACGGGCTCATCTTTTT |
||||||
|
AACACTAAACAAAGGTGGTT |
||||||
|
ATGTTCGCGTTTAAGATGAG |
||||||
|
GCTGCTGGAAACGACTAGAT |
||||||
|
GTATCTCAAGAACCCTCTCT |
||||||
|
CCAACGCAAGAGGTCCATTA |
||||||
|
CACGGGCTCATCTTTTTGCC |
||||||
|
TGCTGCTGGAAACGACTAGA |
||||||
|
TTGTTGTCGTCATGGTTACA |
||||||
|
CGCACGCGGTCAACTTGTAA |
||||||
|
GCCGGGACAATGGCCCGTGT |
||||||
|
GGAATCCTGGGAGGGCCAAC |
||||||
|
TAGATAGATCCAATTGGCCG |
||||||
|
AATGGCCCGTGTGAATGTGG |
||||||
|
CTTCACAAATGGTCGAAACT |
||||||
|
AGAGCGGGAACTGCTGCAGG |
||||||
|
TCCAGTGCTCGCGGGGTGTA |
||||||
|
ACACAGCGTGCCATTCATCC |
||||||
|
TCTTTAGGCAGCATTCAGCG |
||||||
|
CCAGTGCCCATCTCGGCGAA |
||||||
|
TACAGGCTAGTCATGCGGGC |
||||||
|
GTCGGAACGGGAATCCTGGG |
||||||
|
TCGGAACGGGAATCCTGGGA |
||||||
|
GAGTAGTTGGTACCAACGCA |
||||||
|
CCCTGGAAGCGCGCTCCCGC |
||||||
|
AACAATAATTACGTCACCAT |
||||||
|
CTGAATGTCGGGTCCGAGTT |
||||||
|
CGCGTTTAAGATGAGTGACA |
||||||
|
TCGGGCTTTAAGGTTAGGGG |
||||||
|
ATACCGCAGAAGTTCGGCAA |
||||||
|
ACGAATACGACAGAGCGAGG |
||||||
|
TGGTTACAAAATTTACGTCC |
||||||
|
CGCATTAACACGACGTTCAG |
||||||
|
GCGAAGAAACCGAAATCGTG |
||||||
|
ACTAAACAAAGGTGGTTGTG |
||||||
|
TTGCCCATTATAATCCACAT |
||||||
|
CCGGGACAATGGCCCGTGTG |
||||||
|
TCCAGGAGAGATTGAATAGA |
||||||
|
TAATCCACATGGCTATAGCA |
||||||
|
AATGGTCGAAACTAGCTGCA |
||||||
|
AAGAACGAATACGACAGAGC |
||||||
|
AATCCTGGGAGGGCCAACCT |
||||||
|
CGAAGAACGAATACGACAGA |
||||||
|
AATGCGGCTGTCGATCTCCA |
||||||
|
AGATGAGTGACATCAAGGTC |
||||||
|
ATCTCAAGAACCCTCTCTCT |
||||||
|
GTGCCCATCTCGGCGAAAAA |
||||||
|
ATAGATCCAATTGGCCGAAA |
||||||
|
AGGCCCTTGACACGAACACT |
||||||
|
CACAAATGGTCGAAACTAGC |
||||||
|
ATTGTTGTCGTCATGGTTAC |
||||||
|
TGTCGATCTCCAGGAAGATG |
||||||
|
GTTCGGCAAAGTTAGAGCGG |
||||||
|
GTCAAAGGGAGCTATAACAG |
||||||
|
CACGACGTTCAGCGAAGAAC |
||||||
|
CGTGCCCCATGTTCGCGTTT |
||||||
|
TGGACCGCTGTCCTTCCAGT |
||||||
|
TCTCGTGCATCCAGGAGAGA |
||||||
|
AGTGACATCAAGGTCCCACA |
||||||
|
CTCTACAGGCTAGTCATGCG |
||||||
|
CAGCGGCCGGGACAATGGCC |
||||||
|
CGTCCGAGCTATAGCAGCAA |
||||||
|
GCGGCCGGGACAATGGCCCG |
||||||
|
GAAGAAACCGAAATCGTGCC |
||||||
|
AGGAGAGCCCCTTGCGCTGT |
||||||
|
CAAGGTCCCACACAGCGTGC |
||||||
|
GTGCTCGCGGGGTGTACCTC |
||||||
|
CTTCCCCTGGAAGCGCGCTC |
||||||
|
GGCTCAATCGGAGGAGTAGT |
||||||
|
GAACCCATTCTTCCCCTGGA |
||||||
|
GTATACCGCAGAAGTTCGGC |
||||||
|
GAATGACCAGGCTCAATCGG |
||||||
|
ATAACAGCCATCAAAGGAGA |
||||||
|
GAGCTATAACAGCCATCAAA |
||||||
|
GGTCCGAGTTTGGCCGAACC |
||||||
|
CCCCTTGCGCTGTATCGTAG |
||||||
|
CTGGTAAGTTACGCATTAAC |
||||||
|
CCTTGACACGAACACTAAAC |
||||||
|
AGATCCAATTGGCCGAAATC |
||||||
|
AGTGCTCGCGGGGTGTACCT |
||||||
|
GTCAACTTGTAACCCGTCGG |
||||||
|
TACGCATTAACACGACGTTC |
||||||
|
ACAAAATTTACGTCCGAGCT |
||||||
|
GCAACAATAATTACGTCACC |
||||||
|
CAATTGGCCGAAATCACGCG |
||||||
|
ACCCGTCGGAACGGGAATCC |
||||||
|
GGTTACAAAATTTACGTCCG |
||||||
|
ACGACAGAGCGAGGCCCTTG |
||||||
|
ATTGAATAGAGATATTCGAG |
||||||
|
TCATGCGGGCTCGCACGCGG |
||||||
|
GCAAAGTTAGAGCGGGAACT |
||||||
|
TGTATCGTAGCTTCACAAAT |
||||||
|
CCCTCTCTCTACAGGCTAGT |
||||||
|
TCTACAGGCTAGTCATGCGG |
||||||
|
GATAGATCCAATTGGCCGAA |
||||||
|
CCGAAATCGTGCCCCATGTT |
||||||
|
TTTAAGGTTAGGGGGTAAGG |
||||||
|
CCGCCTTTATCTCGTGCATC |
||||||
|
GGCCGGGACAATGGCCCGTG |
||||||
|
AATGTGGAAATCGATCGTAT |
||||||
|
ACACGGGCTCATCTTTTTGC |
||||||
|
GAATGTCGGGTCCGAGTTTG |
||||||
|
TGGAAGCGCGCTCCCGCCTT |
||||||
|
GCTCCCGCCTTTATCTCGTG |
||||||
|
CGCCCTGGTAAGTTACGCAT |
||||||
|
GTCGTCATGGTTACAAAATT |
||||||
|
GCGGTCAACTTGTAACCCGT |
||||||
|
ATCGTGCCCCATGTTCGCGT |
||||||
|
GAAACCGAAATCGTGCCCCA |
||||||
|
TGTGGTGCAATTACGAATGA |
||||||
|
CGAAATCACGCGCATTGTTG |
||||||
|
AACCTGAATGTCGGGTCCGA |
||||||
|
CGCCTTTATCTCGTGCATCC |
||||||
|
GTTTAAGATGAGTGACATCA |
||||||
|
ATAATTACGTCACCATTCCC |
||||||
|
AGGAGTAGTTGGTACCAACG |
||||||
|
GGTTAGGGGGTAAGGCCATT |
||||||
|
TCATCTTTTTGCCCATTATA |
||||||
|
ACAAATGGTCGAAACTAGCT |
||||||
|
TCGCGGGGTGTACCTCGGGC |
||||||
|
CGGGCTCATCTTTTTGCCCA |
||||||
|
GCCATCAAAGGAGAGCCCCT |
||||||
|
TTTACGTCCGAGCTATAGCA |
||||||
|
TGCGCTGTATCGTAGCTTCA |
||||||
|
ACGACTAGATAGATCCAATT |
||||||
|
ACAATGCGGCTGTCGATCTC |
||||||
|
GTGAATGTGGAAATCGATCG |
||||||
|
CCGAAATCACGCGCATTGTT |
||||||
|
TGGTGCGAAGAAACCGAAAT |
||||||
|
CTATCGGTTTGGTCACTTGC |
||||||
|
CCCAGCAAATGCCTTTCTTT |
||||||
|
ACCCTCTCTCTACAGGCTAG |
||||||
|
GCTGCAGGGCTCCCGCGCCA |
||||||
|
TGCGGCTGTCGATCTCCAGG |
||||||
|
CCAGGCTCAATCGGAGGAGT |
||||||
|
TTCGAGAGTCACATACACAA |
||||||
|
AATTACGTCACCATTCCCAG |
||||||
|
GGTAAGTTACGCATTAACAC |
||||||
|
CTGTGGTGCAATTACGAATG |
||||||
|
TGTTGGTGCGAAGAAACCGA |
||||||
|
AGGGCCAACCTGAATGTCGG |
||||||
|
AAGGTTAGGGGGTAAGGCCA |
||||||
|
AATTTACGTCCGAGCTATAG |
||||||
|
ATACACAATGCGGCTGTCGA |
||||||
|
TCGTGCCCCATGTTCGCGTT |
||||||
|
CAGAAGTTCGGCAAAGTTAG |
||||||
|
TTGAATAGAGATATTCGAGA |
||||||
|
TCAGCGCCCTGGTAAGTTAC |
||||||
|
AAATCGTGCCCCATGTTCGC |
||||||
|
ATGCCTTTCTTTAGGCAGCA |
||||||
|
TGTCGTCATGGTTACAAAAT |
||||||
|
CCATGTTCGCGTTTAAGATG |
||||||
|
GCATTCAGCGCCCTGGTAAG |
||||||
|
CCCATGTTCGCGTTTAAGAT |
||||||
|
ATCCTGGGAGGGCCAACCTG |
||||||
|
AACACGACGTTCAGCGAAGA |
||||||
|
GAATAGAGATATTCGAGAGT |
||||||
|
GTACCTCGGGCTTTAAGGTT |
||||||
|
CAGGCTCAATCGGAGGAGTA |
||||||
|
TTTAGGCAGCATTCAGCGCC |
||||||
|
ATTCTTCCCCTGGAAGCGCG |
||||||
|
ACGAATGACCAGGCTCAATC |
||||||
|
GGCTCATCTTTTTGCCCATT |
||||||
|
GAGCGAGGCCCTTGACACGA |
||||||
|
AATGACCAGGCTCAATCGGA |
||||||
|
AAGAGGTCCATTATTAACAC |
||||||
|
GGGTGTACCTCGGGCTTTAA |
||||||
|
TTTTTGGTGTGGAACCCATT |
||||||
|
TGACCAGGCTCAATCGGAGG |
||||||
|
ACACTAAACAAAGGTGGTTG |
||||||
|
GGGGTGTACCTCGGGCTTTA |
||||||
|
ACGTCACCATTCCCAGCAAA |
||||||
|
CGAAACTAGCTGCACTATCG |
||||||
|
CCTCGGGCTTTAAGGTTAGG |
||||||
|
TTAACACGGGCTCATCTTTT |
||||||
|
AAATGGTATCTCAAGAACCC |
||||||
|
TGACATCAAGGTCCCACACA |
||||||
|
GAGGCCCTTGACACGAACAC |
||||||
|
CTTGCGCTGTATCGTAGCTT |
||||||
|
GTCCGAGTTTGGCCGAACCA |
||||||
|
GGAGGAGTAGTTGGTACCAA |
||||||
|
TTCCAGTGCTCGCGGGGTGT |
||||||
|
GAACGAATACGACAGAGCGA |
||||||
|
CGGGTCCGAGTTTGGCCGAA |
||||||
|
AATTACGAATGACCAGGCTC |
||||||
|
GTCGGGTCCGAGTTTGGCCG |
||||||
|
TATAGCAGCAAAAATTTTGG |
||||||
|
AGTAGTTGGTACCAACGCAA |
||||||
|
TGCTGGAAACGACTAGATAG |
||||||
|
TGGCCCGTGTGAATGTGGAA |
||||||
|
TGTTCGCGTTTAAGATGAGT |
||||||
|
AGCGGCCGGGACAATGGCCC |
||||||
|
ATTCCCAGCAAATGCCTTTC |
||||||
|
TACCTCGGGCTTTAAGGTTA |
||||||
|
AGTCATGCGGGCTCGCACGC |
||||||
|
GGCAGCATTCAGCGCCCTGG |
||||||
|
CGAGAGTCACATACACAATG |
||||||
|
GTGTGGAACCCATTCTTCCC |
||||||
|
CATCCAGGAGAGATTGAATA |
||||||
|
GGCCGAAATCACGCGCATTG |
||||||
|
TTGACACGAACACTAAACAA |
||||||
|
TCTCAAGAACCCTCTCTCTA |
||||||
|
TGCATCCAGGAGAGATTGAA |
||||||
|
ATCTCCAGGAAGATGTTGGT |
||||||
|
CACATACACAATGCGGCTGT |
||||||
|
TTGGTGCGAAGAAACCGAAA |
||||||
|
AGTGCCCATCTCGGCGAAAA |
||||||
|
GCGCTCCCGCCTTTATCTCG |
||||||
|
TAGATCCAATTGGCCGAAAT |
||||||
|
GGCCCTTGACACGAACACTA |
||||||
|
CAAAAATTTTGGACCGCTGT |
||||||
|
CATCAAGGTCCCACACAGCG |
||||||
|
CAGCAAATGCCTTTCTTTAG |
||||||
|
CATACACAATGCGGCTGTCG |
||||||
|
ATCGTATACCGCAGAAGTTC |
||||||
|
GAATACGACAGAGCGAGGCC |
||||||
|
TCAACTTGTAACCCGTCGGA |
||||||
|
TGTAACCCGTCGGAACGGGA |
||||||
|
TGCAATTACGAATGACCAGG |
||||||
|
TCGTAGCTTCACAAATGGTC |
||||||
|
TCCATTATTAACACGGGCTC |
||||||
|
TTGGCCGAAATCACGCGCAT |
||||||
|
CCTCTCTCTACAGGCTAGTC |
||||||
|
ATTGGCCGAAATCACGCGCA |
||||||
|
GCTAGTCATGCGGGCTCGCA |
||||||
|
AGGTCCCACACAGCGTGCCA |
||||||
|
GGAAATCGATCGTATACCGC |
||||||
|
CGCGCCAGTGCCCATCTCGG |
||||||
|
TCGATCGTATACCGCAGAAG |
||||||
|
CCAATTGGCCGAAATCACGC |
||||||
|
ATACGACAGAGCGAGGCCCT |
||||||
|
CGCTCCCGCCTTTATCTCGT |
||||||
|
TGGCCGAAATCACGCGCATT |
||||||
|
AATAATTACGTCACCATTCC |
||||||
|
CAGCGAAGAACGAATACGAC |
||||||
|
TACGACAGAGCGAGGCCCTT |
||||||
|
GAAGCGCGCTCCCGCCTTTA |
||||||
|
TCTTCCCCTGGAAGCGCGCT |
||||||
|
GAGTGACATCAAGGTCCCAC |
||||||
|
GCGGGCTCGCACGCGGTCAA |
||||||
|
CGGCTGTCGATCTCCAGGAA |
||||||
|
CCGTCGGAACGGGAATCCTG |
||||||
|
GTAAGTTACGCATTAACACG |
||||||
|
ATCTTTTTGCCCATTATAAT |
||||||
|
AATACGACAGAGCGAGGCCC |
||||||
|
GACACGAACACTAAACAAAG |
||||||
|
GTGTGAATGTGGAAATCGAT |
||||||
|
TGCACTATCGGTTTGGTCAC |
||||||
|
GTCATGCGGGCTCGCACGCG |
||||||
|
ACCCATTCTTCCCCTGGAAG |
||||||
|
ATCTCGTGCATCCAGGAGAG |
||||||
|
AAGTTAGAGCGGGAACTGCT |
||||||
|
CTATAGCAACAATAATTACG |
||||||
|
GTGCAATTACGAATGACCAG |
||||||
|
CAGGCTAGTCATGCGGGCTC |
||||||
|
ATAATCCACATGGCTATAGC |
||||||
|
GCTATAACAGCCATCAAAGG |
||||||
|
AGATGTTGGTGCGAAGAAAC |
||||||
|
ATGGTCGAAACTAGCTGCAC |
||||||
|
GGAGAGATTGAATAGAGATA |
||||||
|
CGACAGAGCGAGGCCCTTGA |
||||||
|
TCGCACGCGGTCAACTTGTA |
||||||
|
GAAATCACGCGCATTGTTGT |
||||||
|
TATACCGCAGAAGTTCGGCA |
||||||
|
CAGCATTCAGCGCCCTGGTA |
||||||
|
TTTTGGTGTGGAACCCATTC |
||||||
|
TATAATCCACATGGCTATAG |
||||||
|
ATCACGCGCATTGTTGTCGT |
||||||
|
ATTATAATCCACATGGCTAT |
||||||
|
TGTCGGGTCCGAGTTTGGCC |
||||||
|
AAGTTCGGCAAAGTTAGAGC |
||||||
|
ACGGGAATCCTGGGAGGGCC |
||||||
|
ATTTACGTCCGAGCTATAGC |
||||||
|
CTGGAAGCGCGCTCCCGCCT |
||||||
|
GCAAATGCCTTTCTTTAGGC |
||||||
|
TTACGCATTAACACGACGTT |
||||||
|
AGGTGGTTGTGATCAGCGGC |
||||||
|
GCCCGTGTGAATGTGGAAAT |
||||||
|
GTCCGAGCTATAGCAGCAAA |
||||||
|
AGGTCAAAGGGAGCTATAAC |
||||||
|
GCGGCTGTCGATCTCCAGGA |
||||||
|
TTGCGCTGTATCGTAGCTTC |
||||||
|
AAAGTTAGAGCGGGAACTGC |
||||||
|
TTCCCCTGGAAGCGCGCTCC |
||||||
|
CCCCATGTTCGCGTTTAAGA |
||||||
|
TATCTCAAGAACCCTCTCTC |
||||||
|
CCGCGCCAGTGCCCATCTCG |
||||||
|
GGAACCCATTCTTCCCCTGG |
||||||
|
ATAGCAGCAAAAATTTTGGA |
||||||
|
CGTTTAAGATGAGTGACATC |
||||||
|
TATAACAGCCATCAAAGGAG |
||||||
|
CCAGGAAGATGTTGGTGCGA |
||||||
|
TACGTCACCATTCCCAGCAA |
||||||
|
GCCTTTCTTTAGGCAGCATT |
||||||
|
AACAGCCATCAAAGGAGAGC |
||||||
|
GCATTAACACGACGTTCAGC |
||||||
|
CTCCCGCCTTTATCTCGTGC |
||||||
|
CGTGCATCCAGGAGAGATTG |
||||||
|
CTGGAAACGACTAGATAGAT |
||||||
|
GCATTGTTGTCGTCATGGTT |
||||||
|
CGGCCGGGACAATGGCCCGT |
||||||
|
TTATTAACACGGGCTCATCT |
||||||
|
TAGCAGCAAAAATTTTGGAC |
||||||
|
TCAAGAACCCTCTCTCTACA |
||||||
|
AGTCACATACACAATGCGGC |
||||||
|
CAACGCAAGAGGTCCATTAT |
||||||
|
CGGGCTCGCACGCGGTCAAC |
||||||
|
CTCGCGGGGTGTACCTCGGG |
||||||
|
ATGTGGAAATCGATCGTATA |
||||||
|
AGCTATAACAGCCATCAAAG |
||||||
|
TTAAGGTTAGGGGGTAAGGC |
||||||
|
TCCAATTGGCCGAAATCACG |
||||||
|
AACGACTAGATAGATCCAAT |
||||||
|
GGCCAACCTGAATGTCGGGT |
||||||
|
CCTTTCTTTAGGCAGCATTC |
||||||
|
AGTTGGTACCAACGCAAGAG |
||||||
|
TGTGAATGTGGAAATCGATC |
||||||
|
CATGGTTACAAAATTTACGT |
||||||
|
TTGCTGCTGGAAACGACTAG |
||||||
|
CATGCGGGCTCGCACGCGGT |
||||||
|
CAACAATAATTACGTCACCA |
||||||
|
GAGAGATTGAATAGAGATAT |
||||||
|
GTGCGAAGAAACCGAAATCG |
||||||
|
GGGCTCGCACGCGGTCAACT |
||||||
|
GCAAGAGGTCCATTATTAAC |
||||||
|
GTCACTTGCTGCTGGAAACG |
||||||
|
GGCCGAACCAGGTCAAAGGG |
||||||
|
CGCGCATTGTTGTCGTCATG |
||||||
|
CATTATAATCCACATGGCTA |
||||||
|
GGTGCAATTACGAATGACCA |
||||||
|
GGGAATCCTGGGAGGGCCAA |
||||||
|
CATCTCGGCGAAAAATGGTA |
||||||
|
CGTGTGAATGTGGAAATCGA |
||||||
|
GCTATAGCAACAATAATTAC |
||||||
|
CCGCAGAAGTTCGGCAAAGT |
||||||
|
ACTAGATAGATCCAATTGGC |
||||||
|
AAACAAAGGTGGTTGTGATC |
||||||
|
TCGCGTTTAAGATGAGTGAC |
||||||
|
TCGGGTCCGAGTTTGGCCGA |
||||||
|
AAAATGGTATCTCAAGAACC |
||||||
|
GGACAATGGCCCGTGTGAAT |
||||||
|
GACCAGGCTCAATCGGAGGA |
||||||
|
CAATAATTACGTCACCATTC |
||||||
|
ATCGGTTTGGTCACTTGCTG |
||||||
|
TCGGCGAAAAATGGTATCTC |
||||||
|
AACCCATTCTTCCCCTGGAA |
||||||
|
TCCCGCCTTTATCTCGTGCA |
||||||
|
GATCTCCAGGAAGATGTTGG |
||||||
|
GAAAAATGGTATCTCAAGAA |
||||||
|
GGCCCGTGTGAATGTGGAAA |
||||||
|
GTCCTTCCAGTGCTCGCGGG |
||||||
|
TTGTAACCCGTCGGAACGGG |
||||||
|
ACGCAAGAGGTCCATTATTA |
||||||
|
GCTCGCGGGGTGTACCTCGG |
||||||
|
TAAGGTTAGGGGGTAAGGCC |
||||||
|
CCCCTGGAAGCGCGCTCCCG |
||||||
|
GCTATAGCAGCAAAAATTTT |
||||||
|
TCTCGGCGAAAAATGGTATC |
||||||
|
ATCCAATTGGCCGAAATCAC |
||||||
|
ATGGCCCGTGTGAATGTGGA |
||||||
|
GCCCCTTGCGCTGTATCGTA |
||||||
|
TTGTCGTCATGGTTACAAAA |
||||||
|
AACCGAAATCGTGCCCCATG |
||||||
|
CCAGTGCTCGCGGGGTGTAC |
||||||
|
CCATTATAATCCACATGGCT |
||||||
|
ATGTTGGTGCGAAGAAACCG |
||||||
|
ATGGTTACAAAATTTACGTC |
||||||
|
AGTTTGGCCGAACCAGGTCA |
||||||
|
GTACCAACGCAAGAGGTCCA |
||||||
|
GAGGAGTAGTTGGTACCAAC |
||||||
|
CAAAGTTAGAGCGGGAACTG |
||||||
|
TTAGGGGGTAAGGCCATTTG |
||||||
|
GAAGATGTTGGTGCGAAGAA |
||||||
|
ATGGCTATAGCAACAATAAT |
||||||
|
AAATCACGCGCATTGTTGTC |
||||||
|
TATTCGAGAGTCACATACAC |
||||||
|
AAATGCCTTTCTTTAGGCAG |
||||||
|
CGAATACGACAGAGCGAGGC |
||||||
|
AGATAGATCCAATTGGCCGA |
||||||
|
TGACACGAACACTAAACAAA |
||||||
|
GCCAACCTGAATGTCGGGTC |
||||||
|
CTGCTGCAGGGCTCCCGCGC |
||||||
|
CTCGTGCATCCAGGAGAGAT |
||||||
|
TTTGCCCATTATAATCCACA |
||||||
|
ACGAACACTAAACAAAGGTG |
||||||
|
TGCAGGGCTCCCGCGCCAGT |
||||||
|
AGAAGTTCGGCAAAGTTAGA |
||||||
|
TAGAGATATTCGAGAGTCAC |
||||||
|
CTGCACTATCGGTTTGGTCA |
||||||
|
GGCCATTTGTGCCTTTTTGG |
||||||
|
CACACAGCGTGCCATTCATC |
||||||
|
CAAATGGTCGAAACTAGCTG |
||||||
|
TTACAAAATTTACGTCCGAG |
||||||
|
AACCAGGTCAAAGGGAGCTA |
||||||
|
GCTCAATCGGAGGAGTAGTT |
||||||
|
CCCTGGTAAGTTACGCATTA |
||||||
|
CCGAACCAGGTCAAAGGGAG |
||||||
|
TCCTTCCAGTGCTCGCGGGG |
||||||
|
GCGGGAACTGCTGCAGGGCT |
||||||
|
ACACGAACACTAAACAAAGG |
||||||
|
GGGACAATGGCCCGTGTGAA |
||||||
|
TCAATCGGAGGAGTAGTTGG |
||||||
|
TAAGATGAGTGACATCAAGG |
||||||
|
CCACATGGCTATAGCAACAA |
||||||
|
TGAATAGAGATATTCGAGAG |
||||||
|
AGAGATTGAATAGAGATATT |
||||||
|
ACGGGCTCATCTTTTTGCCC |
||||||
|
CTTTTTGCCCATTATAATCC |
||||||
|
CTCGCACGCGGTCAACTTGT |
||||||
|
CTTCCAGTGCTCGCGGGGTG |
||||||
|
CTCATCTTTTTGCCCATTAT |
||||||
|
AGGCCATTTGTGCCTTTTTG |
||||||
|
AATTTTGGACCGCTGTCCTT |
||||||
|
ATTTGTGCCTTTTTGGTGTG |
||||||
|
ACACGACGTTCAGCGAAGAA |
||||||
|
TTGGTCACTTGCTGCTGGAA |
||||||
|
GTATCGTAGCTTCACAAATG |
||||||
|
ACGCGCATTGTTGTCGTCAT |
||||||
|
CCCTTGCGCTGTATCGTAGC |
||||||
|
GGAAGCGCGCTCCCGCCTTT |
||||||
|
CGTATACCGCAGAAGTTCGG |
||||||
|
AAAGGAGAGCCCCTTGCGCT |
||||||
|
GATGAGTGACATCAAGGTCC |
||||||
|
CGCTGTATCGTAGCTTCACA |
||||||
|
TCACATACACAATGCGGCTG |
||||||
|
AAGAACCCTCTCTCTACAGG |
||||||
|
CAGAGCGAGGCCCTTGACAC |
||||||
|
CCACACAGCGTGCCATTCAT |
||||||
|
GGTCAACTTGTAACCCGTCG |
||||||
|
TATCTCGTGCATCCAGGAGA |
||||||
|
AAGGAGAGCCCCTTGCGCTG |
||||||
|
GGTCCATTATTAACACGGGC |
||||||
|
GTCGATCTCCAGGAAGATGT |
||||||
|
TTAGAGCGGGAACTGCTGCA |
||||||
|
AAAGGTGGTTGTGATCAGCG |
||||||
|
TGGTTGTGATCAGCGGCCGG |
||||||
|
GTTAGAGCGGGAACTGCTGC |
||||||
|
CCAACCTGAATGTCGGGTCC |
||||||
|
GTTCAGCGAAGAACGAATAC |
||||||
|
CTCCCGCGCCAGTGCCCATC |
||||||
|
TCGTGCATCCAGGAGAGATT |
||||||
|
CGCGGGGTGTACCTCGGGCT |
||||||
|
CACTAAACAAAGGTGGTTGT |
||||||
|
CCCTTGACACGAACACTAAA |
||||||
|
CGGTTTGGTCACTTGCTGCT |
||||||
|
ACTATCGGTTTGGTCACTTG |
||||||
|
TGGAAACGACTAGATAGATC |
||||||
|
GTGGTGCAATTACGAATGAC |
||||||
|
TGGTGCAATTACGAATGACC |
||||||
|
GACGTTCAGCGAAGAACGAA |
||||||
|
AATCGATCGTATACCGCAGA |
||||||
|
TTTGTGCCTTTTTGGTGTGG |
||||||
|
CATTCAGCGCCCTGGTAAGT |
||||||
|
GCGCCAGTGCCCATCTCGGC |
||||||
|
GTAACCCGTCGGAACGGGAA |
||||||
|
AACTGCTGCAGGGCTCCCGC |
||||||
|
AGAGATATTCGAGAGTCACA |
||||||
|
GTCCCACACAGCGTGCCATT |
||||||
|
AGCTATAGCAGCAAAAATTT |
||||||
|
TTTTGGACCGCTGTCCTTCC |
||||||
|
TCGGAGGAGTAGTTGGTACC |
||||||
|
CTAGCTGCACTATCGGTTTG |
||||||
|
TCCAGGAAGATGTTGGTGCG |
||||||
|
AGATTGAATAGAGATATTCG |
||||||
|
CAGGAAGATGTTGGTGCGAA |
||||||
|
TCAAAGGAGAGCCCCTTGCG |
||||||
|
CCATTATTAACACGGGCTCA |
||||||
|
AAGGGAGCTATAACAGCCAT |
||||||
|
CTATAACAGCCATCAAAGGA |
||||||
|
AAGGTGGTTGTGATCAGCGG |
||||||
|
GAGGGCCAACCTGAATGTCG |
||||||
|
GCGAAAAATGGTATCTCAAG |
||||||
|
TTATAATCCACATGGCTATA |
||||||
|
AGGCTCAATCGGAGGAGTAG |
||||||
|
TGCGGGCTCGCACGCGGTCA |
||||||
|
TCACTTGCTGCTGGAAACGA |
||||||
|
AGGGAGCTATAACAGCCATC |
||||||
|
GGACCGCTGTCCTTCCAGTG |
||||||
|
TTCACAAATGGTCGAAACTA |
||||||
|
TAACACGACGTTCAGCGAAG |
||||||
|
CCGTGTGAATGTGGAAATCG |
||||||
|
CGGGCTTTAAGGTTAGGGGG |
||||||
|
TTGTGATCAGCGGCCGGGAC |
||||||
|
GTGACATCAAGGTCCCACAC |
||||||
|
TGGCCGAACCAGGTCAAAGG |
||||||
|
GACCGCTGTCCTTCCAGTGC |
||||||
|
TTCTTCCCCTGGAAGCGCGC |
||||||
|
GTCGAAACTAGCTGCACTAT |
||||||
|
TACGTCCGAGCTATAGCAGC |
||||||
|
CCTTGCGCTGTATCGTAGCT |
||||||
|
CGGTCAACTTGTAACCCGTC |
||||||
|
CAGTGCCCATCTCGGCGAAA |
||||||
|
GCTTCACAAATGGTCGAAAC |
||||||
|
GGTGGTTGTGATCAGCGGCC |
||||||
|
TGTGCCTTTTTGGTGTGGAA |
||||||
|
GGCTCCCGCGCCAGTGCCCA |
||||||
|
TGTGGAACCCATTCTTCCCC |
||||||
|
TCTTTTTGCCCATTATAATC |
||||||
|
CTTTATCTCGTGCATCCAGG |
||||||
|
AACGAATACGACAGAGCGAG |
||||||
|
GATGTTGGTGCGAAGAAACC |
||||||
|
TGCCCATTATAATCCACATG |
||||||
|
AATCGGAGGAGTAGTTGGTA |
||||||
|
GGGTAAGGCCATTTGTGCCT |
||||||
|
GTTTGGCCGAACCAGGTCAA |
||||||
|
TCTCTACAGGCTAGTCATGC |
||||||
|
GAAGAACGAATACGACAGAG |
||||||
|
CGCGCTCCCGCCTTTATCTC |
||||||
|
AATCCACATGGCTATAGCAA |
||||||
|
CTTTTTGGTGTGGAACCCAT |
||||||
|
AGCGCGCTCCCGCCTTTATC |
||||||
|
GAGCTATAGCAGCAAAAATT |
||||||
|
AGCAAAAATTTTGGACCGCT |
||||||
|
TAGTTGGTACCAACGCAAGA |
||||||
|
CAAAGGAGAGCCCCTTGCGC |
||||||
|
ACAGGCTAGTCATGCGGGCT |
||||||
|
GAGAGCCCCTTGCGCTGTAT |
||||||
|
GTTTGGTCACTTGCTGCTGG |
||||||
|
ATTATTAACACGGGCTCATC |
||||||
|
GAAACTAGCTGCACTATCGG |
||||||
|
GCTGTCCTTCCAGTGCTCGC |
||||||
|
TAAACAAAGGTGGTTGTGAT |
||||||
|
CGTCACCATTCCCAGCAAAT |
||||||
|
TCCGAGCTATAGCAGCAAAA |
||||||
|
GGTGTGGAACCCATTCTTCC |
||||||
|
ACCGCAGAAGTTCGGCAAAG |
||||||
|
GGTATCTCAAGAACCCTCTC |
||||||
|
CGCTGTCCTTCCAGTGCTCG |
||||||
|
CATTTGTGCCTTTTTGGTGT |
||||||
|
CTCTCTACAGGCTAGTCATG |
||||||
|
CAGCGCCCTGGTAAGTTACG |
||||||
|
GCCTTTATCTCGTGCATCCA |
||||||
|
ACTTGCTGCTGGAAACGACT |
||||||
|
AACCCGTCGGAACGGGAATC |
||||||
|
CCGAGTTTGGCCGAACCAGG |
||||||
|
CTGGGAGGGCCAACCTGAAT |
||||||
|
GGTCACTTGCTGCTGGAAAC |
||||||
|
CTCCAGGAAGATGTTGGTGC |
||||||
|
CGCGGTCAACTTGTAACCCG |
||||||
|
CACTATCGGTTTGGTCACTT |
||||||
|
GTCATGGTTACAAAATTTAC |
||||||
|
AAGATGTTGGTGCGAAGAAA |
||||||
|
GCAGGGCTCCCGCGCCAGTG |
||||||
|
CTGTCGATCTCCAGGAAGAT |
||||||
|
TTCTTTAGGCAGCATTCAGC |
||||||
|
CACGCGGTCAACTTGTAACC |
||||||
|
TGCCTTTTTGGTGTGGAACC |
||||||
|
GAAATCGATCGTATACCGCA |
||||||
|
AGGCAGCATTCAGCGCCCTG |
||||||
|
GCACGCGGTCAACTTGTAAC |
||||||
|
CAGGGCTCCCGCGCCAGTGC |
||||||
|
TTTAAGATGAGTGACATCAA |
||||||
|
CACGCGCATTGTTGTCGTCA |
||||||
|
GAGTTTGGCCGAACCAGGTC |
||||||
|
ACCGAAATCGTGCCCCATGT |
||||||
|
AGGTCCATTATTAACACGGG |
||||||
|
CAAGAACCCTCTCTCTACAG |
||||||
|
TTTGGTCACTTGCTGCTGGA |
||||||
|
CTTGCTGCTGGAAACGACTA |
||||||
|
ACACAATGCGGCTGTCGATC |
||||||
|
GTTACGCATTAACACGACGT |
||||||
|
AACTTGTAACCCGTCGGAAC |
||||||
|
AGCCATCAAAGGAGAGCCCC |
||||||
|
GCTGTCGATCTCCAGGAAGA |
||||||
|
CAAAGGGAGCTATAACAGCC |
||||||
|
TGGGAGGGCCAACCTGAATG |
||||||
|
GACTAGATAGATCCAATTGG |
||||||
|
TTGTGCCTTTTTGGTGTGGA |
||||||
|
GCGTTTAAGATGAGTGACAT |
||||||
|
CGGGGTGTACCTCGGGCTTT |
||||||
|
GCGCGCTCCCGCCTTTATCT |
||||||
|
ATGCGGCTGTCGATCTCCAG |
||||||
|
CGCAGAAGTTCGGCAAAGTT |
||||||
|
ATGACCAGGCTCAATCGGAG |
||||||
|
CATCAAAGGAGAGCCCCTTG |
||||||
|
ATAGAGATATTCGAGAGTCA |
||||||
|
TGCTCGCGGGGTGTACCTCG |
||||||
|
TTGGTACCAACGCAAGAGGT |
||||||
|
AGCATTCAGCGCCCTGGTAA |
||||||
|
AGGGCTCCCGCGCCAGTGCC |
||||||
|
AACTAGCTGCACTATCGGTT |
||||||
|
CAATCGGAGGAGTAGTTGGT |
||||||
|
GTTCGCGTTTAAGATGAGTG |
||||||
|
GTAGCTTCACAAATGGTCGA |
||||||
|
CATTGTTGTCGTCATGGTTA |
||||||
|
GGGGTAAGGCCATTTGTGCC |
||||||
|
CGAAAAATGGTATCTCAAGA |
||||||
|
ATCAGCGGCCGGGACAATGG |
||||||
|
CTGTCCTTCCAGTGCTCGCG |
||||||
|
GGGAGGGCCAACCTGAATGT |
||||||
|
TCACGCGCATTGTTGTCGTC |
||||||
|
GATCGTATACCGCAGAAGTT |
||||||
|
AGCGCCCTGGTAAGTTACGC |
||||||
|
TAAGTTACGCATTAACACGA |
||||||
|
CCGAGCTATAGCAGCAAAAA |
||||||
|
TAGAGCGGGAACTGCTGCAG |
||||||
|
ACATCAAGGTCCCACACAGC |
||||||
|
ATCGATCGTATACCGCAGAA |
||||||
|
GTGGAACCCATTCTTCCCCT |
||||||
|
ATTAACACGACGTTCAGCGA |
||||||
|
CGGCGAAAAATGGTATCTCA |
||||||
|
TAACAGCCATCAAAGGAGAG |
||||||
|
TGCCCATCTCGGCGAAAAAT |
||||||
|
AATAGAGATATTCGAGAGTC |
||||||
|
AAGCGCGCTCCCGCCTTTAT |
||||||
|
CTTTAGGCAGCATTCAGCGC |
||||||
|
GCACTATCGGTTTGGTCACT |
||||||
|
GGCTTTAAGGTTAGGGGGTA |
||||||
|
GAGATATTCGAGAGTCACAT |
||||||
|
TCAGCGAAGAACGAATACGA |
||||||
|
AAATTTACGTCCGAGCTATA |
||||||
|
GTGGAAATCGATCGTATACC |
||||||
|
GCCCATTATAATCCACATGG |
||||||
|
TAAGGCCATTTGTGCCTTTT |
||||||
|
CTTTCTTTAGGCAGCATTCA |
||||||
|
CTGTATCGTAGCTTCACAAA |
||||||
|
TTGGTGTGGAACCCATTCTT |
||||||
|
GGAGCTATAACAGCCATCAA |
||||||
|
TTCGCGTTTAAGATGAGTGA |
||||||
|
AGGGGGTAAGGCCATTTGTG |
||||||
|
TGAATGTGGAAATCGATCGT |
||||||
|
ACGCATTAACACGACGTTCA |
||||||
|
ATTAACACGGGCTCATCTTT |
||||||
|
AGGAAGATGTTGGTGCGAAG |
||||||
|
GAGCGGGAACTGCTGCAGGG |
||||||
|
AATGGTATCTCAAGAACCCT |
||||||
|
CTACAGGCTAGTCATGCGGG |
||||||
|
CCAGGAGAGATTGAATAGAG |
||||||
|
CGAAGAAACCGAAATCGTGC |
||||||
|
GAACTGCTGCAGGGCTCCCG |
||||||
|
ACCAGGCTCAATCGGAGGAG |
||||||
|
TATCGGTTTGGTCACTTGCT |
||||||
|
TTTTGCCCATTATAATCCAC |
||||||
|
TTCAGCGAAGAACGAATACG |
||||||
|
GGCTCGCACGCGGTCAACTT |
||||||
|
TCACCATTCCCAGCAAATGC |
||||||
|
TGCGAAGAAACCGAAATCGT |
||||||
|
AGAGGTCCATTATTAACACG |
||||||
|
GGCGAAAAATGGTATCTCAA |
||||||
|
CACAATGCGGCTGTCGATCT |
||||||
|
TACCAACGCAAGAGGTCCAT |
||||||
|
TACACAATGCGGCTGTCGAT |
||||||
|
CACGAACACTAAACAAAGGT |
||||||
|
GAACACTAAACAAAGGTGGT |
||||||
|
CTCAATCGGAGGAGTAGTTG |
||||||
|
CGCATTGTTGTCGTCATGGT |
||||||
|
AGATATTCGAGAGTCACATA |
||||||
|
ACCTGAATGTCGGGTCCGAG |
||||||
|
TTACGAATGACCAGGCTCAA |
||||||
|
GCCCCATGTTCGCGTTTAAG |
||||||
|
AGCTTCACAAATGGTCGAAA |
||||||
|
GAACGGGAATCCTGGGAGGG |
||||||
|
CGGGAACTGCTGCAGGGCTC |
||||||
|
TGGTCGAAACTAGCTGCACT |
||||||
|
TCGATCTCCAGGAAGATGTT |
||||||
|
CCATTCTTCCCCTGGAAGCG |
||||||
|
AAGATGAGTGACATCAAGGT |
||||||
|
CATTCCCAGCAAATGCCTTT |
||||||
|
TCCACATGGCTATAGCAACA |
||||||
|
GCTCCCGCGCCAGTGCCCAT |
||||||
|
CGGCAAAGTTAGAGCGGGAA |
||||||
|
CTTTAAGGTTAGGGGGTAAG |
||||||
|
CCCATCTCGGCGAAAAATGG |
||||||
|
CCATTTGTGCCTTTTTGGTG |
||||||
|
GCTGTATCGTAGCTTCACAA |
||||||
|
CCTGGAAGCGCGCTCCCGCC |
||||||
|
AAACTAGCTGCACTATCGGT |
||||||
|
AGAACCCTCTCTCTACAGGC |
||||||
|
CGGGACAATGGCCCGTGTGA |
||||||
|
CATTATTAACACGGGCTCAT |
||||||
|
GCGCATTGTTGTCGTCATGG |
||||||
|
CTCGGCGAAAAATGGTATCT |
||||||
|
AACGCAAGAGGTCCATTATT |
||||||
|
ACGACGTTCAGCGAAGAACG |
||||||
|
GCTGGAAACGACTAGATAGA |
||||||
|
ACATGGCTATAGCAACAATA |
||||||
|
CGGGAATCCTGGGAGGGCCA |
||||||
|
CAAATGCCTTTCTTTAGGCA |
||||||
|
TCGTCATGGTTACAAAATTT |
||||||
|
TAGGGGGTAAGGCCATTTGT |
||||||
|
GCAGCAAAAATTTTGGACCG |
||||||
|
TTTGGTGTGGAACCCATTCT |
||||||
|
ACGTCCGAGCTATAGCAGCA |
||||||
|
ATGTCGGGTCCGAGTTTGGC |
||||||
|
CTAGATAGATCCAATTGGCC |
||||||
|
GTTGGTGCGAAGAAACCGAA |
||||||
|
CGCAAGAGGTCCATTATTAA |
||||||
|
GGCTATAGCAACAATAATTA |
||||||
|
AAGAAACCGAAATCGTGCCC |
||||||
|
GCCAGTGCCCATCTCGGCGA |
||||||
|
AAACCGAAATCGTGCCCCAT |
||||||
|
GGGCTCATCTTTTTGCCCAT |
||||||
|
GCCGAACCAGGTCAAAGGGA |
||||||
|
TTATCTCGTGCATCCAGGAG |
||||||
|
AAGGCCATTTGTGCCTTTTT |
||||||
|
TGTGATCAGCGGCCGGGACA |
||||||
|
GTCACATACACAATGCGGCT |
||||||
|
GGCAAAGTTAGAGCGGGAAC |
||||||
|
CGCCAGTGCCCATCTCGGCG |
||||||
|
ACGCGGTCAACTTGTAACCC |
||||||
|
GCTCATCTTTTTGCCCATTA |
||||||
|
CGAACCAGGTCAAAGGGAGC |
||||||
|
TATCGTAGCTTCACAAATGG |
||||||
|
GAGTCACATACACAATGCGG |
||||||
|
AGAGCCCCTTGCGCTGTATC |
||||||
|
ATCAAGGTCCCACACAGCGT |
||||||
|
TAGTCATGCGGGCTCGCACG |
||||||
|
GTTAGGGGGTAAGGCCATTT |
||||||
|
CAATGGCCCGTGTGAATGTG |
||||||
|
GCGAGGCCCTTGACACGAAC |
||||||
|
TCAAAGGGAGCTATAACAGC |
||||||
|
CAAGAGGTCCATTATTAACA |
||||||
|
TCTCCAGGAAGATGTTGGTG |
||||||
|
TCCCGCGCCAGTGCCCATCT |
||||||
|
AATCACGCGCATTGTTGTCG |
||||||
|
ATCTCGGCGAAAAATGGTAT |
||||||
|
AATGTCGGGTCCGAGTTTGG |
||||||
|
TCAAGGTCCCACACAGCGTG |
||||||
|
GGTACCAACGCAAGAGGTCC |
||||||
|
CGTCATGGTTACAAAATTTA |
||||||
|
GACAGAGCGAGGCCCTTGAC |
||||||
|
GGAAGATGTTGGTGCGAAGA |
||||||
|
AACACGGGCTCATCTTTTTG |
||||||
|
CTATAGCAGCAAAAATTTTG |
||||||
|
ATGGTATCTCAAGAACCCTC |
||||||
|
TGTGGAAATCGATCGTATAC |
||||||
|
TGCCTTTCTTTAGGCAGCAT |
||||||
|
TCGGTTTGGTCACTTGCTGC |
||||||
|
TGGTACCAACGCAAGAGGTC |
||||||
|
CGAATGACCAGGCTCAATCG |
||||||
|
TTGGACCGCTGTCCTTCCAG |
||||||
|
TGTTGTCGTCATGGTTACAA |
||||||
|
TAGGCAGCATTCAGCGCCCT |
||||||
|
GCCCATCTCGGCGAAAAATG |
||||||
|
ACAATAATTACGTCACCATT |
||||||
|
TACAAAATTTACGTCCGAGC |
||||||
|
AATTGGCCGAAATCACGCGC |
||||||
|
TTTGGACCGCTGTCCTTCCA |
||||||
|
GCGGGGTGTACCTCGGGCTT |
||||||
|
GCAGAAGTTCGGCAAAGTTA |
||||||
|
CCCGCCTTTATCTCGTGCAT |
||||||
|
CAAAATTTACGTCCGAGCTA |
||||||
|
CCATCTCGGCGAAAAATGGT |
||||||
|
TGAATGTCGGGTCCGAGTTT |
||||||
|
TCCTGGGAGGGCCAACCTGA |
||||||
|
TCCGAGTTTGGCCGAACCAG |
||||||
|
AGGCTAGTCATGCGGGCTCG |
||||||
|
TATTAACACGGGCTCATCTT |
||||||
|
GCTGCACTATCGGTTTGGTC |
||||||
|
TATAGCAACAATAATTACGT |
||||||
|
GTTACAAAATTTACGTCCGA |
||||||
|
GGGTCCGAGTTTGGCCGAAC |
||||||
|
GCCTTTTTGGTGTGGAACCC |
||||||
|
AGGAGAGATTGAATAGAGAT |
||||||
|
TGGTGTGGAACCCATTCTTC |
||||||
|
ACAATGGCCCGTGTGAATGT |
||||||
|
TAACCCGTCGGAACGGGAAT |
||||||
|
AGTTACGCATTAACACGACG |
||||||
|
CAGGAGAGATTGAATAGAGA |
||||||
|
GCGCTGTATCGTAGCTTCAC |
||||||
|
GCTCGCACGCGGTCAACTTG |
||||||
|
GGTGCGAAGAAACCGAAATC |
||||||
|
ATGCGGGCTCGCACGCGGTC |
||||||
|
GGTGTACCTCGGGCTTTAAG |
||||||
|
TACCGCAGAAGTTCGGCAAA |
||||||
|
GTGCCCCATGTTCGCGTTTA |
||||||
|
GTAAGGCCATTTGTGCCTTT |
||||||
|
GCCCTTGACACGAACACTAA |
||||||
|
GAACCAGGTCAAAGGGAGCT |
||||||
|
TTACGTCCGAGCTATAGCAG |
||||||
|
ATGAGTGACATCAAGGTCCC |
||||||
|
ACCAACGCAAGAGGTCCATT |
||||||
|
ATCAAAGGAGAGCCCCTTGC |
||||||
|
ATTACGTCACCATTCCCAGC |
||||||
|
TCACAAATGGTCGAAACTAG |
||||||
|
ACATACACAATGCGGCTGTC |
||||||
|
ACCATTCCCAGCAAATGCCT |
||||||
|
AGCTGCACTATCGGTTTGGT |
||||||
|
CAGCAAAAATTTTGGACCGC |
||||||
|
GTGGTTGTGATCAGCGGCCG |
||||||
|
TGGAACCCATTCTTCCCCTG |
||||||
|
AGAGTCACATACACAATGCG |
||||||
|
TAGCTGCACTATCGGTTTGG |
||||||
|
CTAAACAAAGGTGGTTGTGA |
||||||
|
AGTTAGAGCGGGAACTGCTG |
||||||
|
GTTGTGATCAGCGGCCGGGA |
||||||
|
CGTCGGAACGGGAATCCTGG |
||||||
|
CACCATTCCCAGCAAATGCC |
||||||
|
CCTTTTTGGTGTGGAACCCA |
||||||
|
AAATTTTGGACCGCTGTCCT |
||||||
|
CATGGCTATAGCAACAATAA |
||||||
|
TCGAAACTAGCTGCACTATC |
||||||
|
AAAGGGAGCTATAACAGCCA |
||||||
|
ATATTCGAGAGTCACATACA |
||||||
|
AAAAATTTTGGACCGCTGTC |
||||||
|
TTAACACGACGTTCAGCGAA |
||||||
|
TCGTATACCGCAGAAGTTCG |
||||||
|
CGAACACTAAACAAAGGTGG |
||||||
|
AGCAGCAAAAATTTTGGACC |
||||||
|
AGTTCGGCAAAGTTAGAGCG |
||||||
|
TTCCCAGCAAATGCCTTTCT |
||||||
|
ATTCGAGAGTCACATACACA |
||||||
|
GTTGTCGTCATGGTTACAAA |
||||||
|
GGGGGTAAGGCCATTTGTGC |
||||||
|
TTCGGCAAAGTTAGAGCGGG |
||||||
|
AACAAAGGTGGTTGTGATCA |
||||||
|
TGCTGCAGGGCTCCCGCGCC |
||||||
|
TGGTAAGTTACGCATTAACA |
||||||
|
CTTGACACGAACACTAAACA |
||||||
|
CCATCAAAGGAGAGCCCCTT |
||||||
|
AGCGAAGAACGAATACGACA |
||||||
|
TTGGCCGAACCAGGTCAAAG |
||||||
|
TTTATCTCGTGCATCCAGGA |
||||||
|
CTCGGGCTTTAAGGTTAGGG |
||||||
|
CGTAGCTTCACAAATGGTCG |
||||||
|
TTACGTCACCATTCCCAGCA |
||||||
|
GGAAACGACTAGATAGATCC |
||||||
|
CGAGGCCCTTGACACGAACA |
||||||
|
ACCTCGGGCTTTAAGGTTAG |
||||||
|
TCCCACACAGCGTGCCATTC |
||||||
|
TTTTTGCCCATTATAATCCA |
||||||
|
AGCAAATGCCTTTCTTTAGG |
||||||
|
CGATCGTATACCGCAGAAGT |
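Many of the 20-mers listed above overlap by 19 characters; for instance, the last 19 characters of `GGGGTGTACCTCGGGCTTTA` equal the first 19 characters of `GGGTGTACCTCGGGCTTTAA`. The diff does not say which problem this file belongs to, so the following is only a sketch of how such a suffix/prefix overlap graph could be built. The name `OverlapGraph` and the five-letter sample k-mers are illustrative assumptions, not part of the `rosalind` package.

```go
package main

import (
	"fmt"
	"sort"
)

// OverlapGraph is a hypothetical helper (not taken from the rosalind package).
// For each k-mer it returns the sorted list of k-mers whose (k-1)-prefix
// equals that k-mer's (k-1)-suffix. K-mers without successors are omitted.
func OverlapGraph(kmers []string) map[string][]string {
	// Index the k-mers by prefix so each suffix lookup is a single map access.
	byPrefix := make(map[string][]string)
	for _, km := range kmers {
		if len(km) < 2 {
			continue
		}
		prefix := km[:len(km)-1]
		byPrefix[prefix] = append(byPrefix[prefix], km)
	}

	graph := make(map[string][]string)
	for _, km := range kmers {
		if len(km) < 2 {
			continue
		}
		next := byPrefix[km[1:]]
		if len(next) == 0 {
			continue
		}
		// Copy before sorting so the index itself is left untouched.
		out := append([]string(nil), next...)
		sort.Strings(out)
		graph[km] = out
	}
	return graph
}

func main() {
	kmers := []string{"ATGCG", "GCATG", "CATGC", "AGGCA", "GGCAT"}
	graph := OverlapGraph(kmers)

	// Print the adjacency list in sorted node order.
	nodes := make([]string, 0, len(graph))
	for node := range graph {
		nodes = append(nodes, node)
	}
	sort.Strings(nodes)
	for _, node := range nodes {
		for _, neighbor := range graph[node] {
			fmt.Printf("%s -> %s\n", node, neighbor)
		}
	}
}
```

Indexing by prefix keeps the construction roughly linear in the number of k-mers, which matters for inputs of the size shown above; a pairwise comparison would be quadratic.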
@ -0,0 +1,49 @@ |
import jinja2
import os

def main():

    # Jinja env
    env = jinja2.Environment(loader=jinja2.FileSystemLoader('.'))

    problems = [
        {
            'chapter': '3',
            'problem': 'a',
            'title': 'Generate k-mer Composition of a String',
            'description': 'Given an input string, generate a list of all k-mers that are in the input string.',
            'url': 'http://rosalind.info/problems/ba3a/'
        },
        {
            'chapter': '3',
            'problem': 'b',
            'title': 'Reconstruct string from genome path',
            'description': 'Reconstruct a string from its genome path, i.e., sequential fragments of overlapping DNA.',
            'url': 'http://rosalind.info/problems/ba3b/'
        },
        {
            'chapter': '3',
            'problem': 'c',
            'title': 'Construct the overlap graph of a set of k-mers',
            'description': 'Given a set of overlapping k-mers, construct the overlap graph and print a sorted adjacency list.',
            'url': 'http://rosalind.info/problems/ba3c/'
        },
    ]

    print("Writing problem boilerplate code")

    t = 'template.go.j2'
    for problem in problems:
        contents = env.get_template(t).render(**problem)
        fname = 'ba'+problem['chapter']+problem['problem']+'.go'
        if not os.path.exists(fname):
            print("Writing to file %s..."%(fname))
            with open(fname,'w') as f:
                f.write(contents)
        else:
            print("File %s already exists, skipping..."%(fname))

    print("Done")

if __name__=="__main__":
    main()
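The first two problem descriptions in the script above (k-mer composition and reconstruction from a genome path) can be illustrated with a short Go sketch. The function names `KmerComposition` and `StringFromGenomePath` and the sample inputs are assumptions made for this example; the diff does not show how the `rosalind` package actually implements these problems.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// KmerComposition is a hypothetical helper: it returns every length-k
// substring of text in lexicographic order, keeping duplicates.
func KmerComposition(text string, k int) []string {
	if k <= 0 || k > len(text) {
		return nil
	}
	kmers := make([]string, 0, len(text)-k+1)
	for i := 0; i+k <= len(text); i++ {
		kmers = append(kmers, text[i:i+k])
	}
	sort.Strings(kmers)
	return kmers
}

// StringFromGenomePath is a hypothetical helper: it rebuilds a string from
// k-mers given in path order, where each k-mer must overlap the previous
// one by k-1 characters.
func StringFromGenomePath(path []string) (string, error) {
	if len(path) == 0 {
		return "", nil
	}
	var b strings.Builder
	b.WriteString(path[0])
	for i := 1; i < len(path); i++ {
		prev, cur := path[i-1], path[i]
		if len(cur) != len(prev) || len(cur) == 0 || prev[1:] != cur[:len(cur)-1] {
			return "", fmt.Errorf("k-mer %d (%q) does not overlap %q", i, cur, prev)
		}
		// Only the last character of each later k-mer is new.
		b.WriteByte(cur[len(cur)-1])
	}
	return b.String(), nil
}

func main() {
	fmt.Println(KmerComposition("CAATCCAAC", 5))
	// prints [AATCC ATCCA CAATC CCAAC TCCAA]

	s, err := StringFromGenomePath([]string{"ACCGA", "CCGAA", "CGAAG", "GAAGC", "AAGCT"})
	fmt.Println(s, err)
	// prints ACCGAAGCT <nil>
}
```

Returning an error from `StringFromGenomePath` rather than silently concatenating is a design choice for this sketch: it surfaces malformed input early, which is useful when the path is read from a file.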
@ -0,0 +1,49 @@ |
package rosalindchapter{{chapter}}

import (
	"fmt"
	"log"

	rosa "github.com/charlesreid1/go-rosalind/rosalind"
)

// Print problem description for Rosalind.info
// Problem BA{{chapter}}{{problem}}: {{title}}
func BA{{chapter}}{{problem}}Description() {
	description := []string{
		"-----------------------------------------",
		"Rosalind: Problem BA{{chapter}}{{problem}}:",
		"{{title}}",
		"",
		"{{description}}",
		"",
		"URL: {{url}}",
		"",
	}
	for _, line := range description {
		fmt.Println(line)
	}
}

// Run the problem
func BA{{chapter}}{{problem}}(filename string) {

	BA{{chapter}}{{problem}}Description()

	// Read the contents of the input file into a slice of lines
	lines, err := rosa.ReadLines(filename)
	if err != nil {
		log.Fatalf("rosa.ReadLines: %v", err)
	}
	// Placeholder so the generated file compiles before the code below
	// is filled in; remove it once lines is used.
	_ = lines

	//// Input file contents
	//input := lines[0]
	//pattern := lines[1]
	//result := rosa.PatternCount(input, pattern)
	//
	//fmt.Println("")
	//fmt.Printf("Computed result from input file: %s\n", filename)
	//fmt.Println(result)
}

@ -0,0 +1,4 @@ |
# rosalind go package

This directory contains the `rosalind` Go package.

@ -0,0 +1,5 @@ |
|||||||
|
Input: |
||||||
|
CACAGTAGGCGCCGGCACACACAGCCCCGGGCCCCGGGCCGCCCCGGGCCGGCGGCCGCCGGCGCCGGCACACCGGCACAGCCGTACCGGCACAGTAGTACCGGCCGGCCGGCACACCGGCACACCGGGTACACACCGGGGCGCACACACAGGCGGGCGCCGGGCCCCGGGCCGTACCGGGCCGCCGGCGGCCCACAGGCGCCGGCACAGTACCGGCACACACAGTAGCCCACACACAGGCGGGCGGTAGCCGGCGCACACACACACAGTAGGCGCACAGCCGCCCACACACACCGGCCGGCCGGCACAGGCGGGCGGGCGCACACACACCGGCACAGTAGTAGGCGGCCGGCGCACAGCC |
||||||
|
10 2 |
||||||
|
Output: |
||||||
|
GCACACAGAC GCGCACACAC |
@ -0,0 +1,5 @@ |
|||||||
|
Input |
||||||
|
CTTGCCGGCGCCGATTATACGATCGCGGCCGCTTGCCTTCTTTATAATGCATCGGCGCCGCGATCTTGCTATATACGTACGCTTCGCTTGCATCTTGCGCGCATTACGTACTTATCGATTACTTATCTTCGATGCCGGCCGGCATATGCCGCTTTAGCATCGATCGATCGTACTTTACGCGTATAGCCGCTTCGCTTGCCGTACGCGATGCTAGCATATGCTAGCGCTAATTACTTAT |
||||||
|
9 3 |
||||||
|
Output |
||||||
|
AGCGCCGCT AGCGGCGCT |
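The two output 9-mers in the fixture above are reverse complements of each other, which suggests (as an assumption, since the diff does not name the problem) that this fixture exercises a variant that counts a k-mer together with its reverse complement. A minimal sketch of the reverse-complement step follows; `ReverseComplement` is an illustrative name.

```go
package main

import "fmt"

// ReverseComplement returns the reverse complement of a DNA string made of
// the upper-case bases A, C, G and T. Any other byte yields an error.
func ReverseComplement(dna string) (string, error) {
	out := make([]byte, len(dna))
	for i := 0; i < len(dna); i++ {
		var c byte
		switch dna[i] {
		case 'A':
			c = 'T'
		case 'C':
			c = 'G'
		case 'G':
			c = 'C'
		case 'T':
			c = 'A'
		default:
			return "", fmt.Errorf("unexpected base %q at position %d", dna[i], i)
		}
		// Write the complement at the mirrored position to reverse in place.
		out[len(dna)-1-i] = c
	}
	return string(out), nil
}

func main() {
	rc, err := ReverseComplement("AGCGCCGCT")
	fmt.Println(rc, err)
	// prints AGCGGCGCT <nil>
}
```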
File diff suppressed because one or more lines are too long
@ -0,0 +1,10 @@ |
|||||||
|
Input |
||||||
|
5 2 |
||||||
|
TCTGAGCTTGCGTTATTTTTAGACC |
||||||
|
GTTTGACGGGAACCCGACGCCTATA |
||||||
|
TTTTAGATTTCCTCAGTCCACTATA |
||||||
|
CTTACAATTTCGTTATTTATCTAAT |
||||||
|
CAGTAGGAATAGCCACTTTGTTGTA |
||||||
|
AAATCCATTAAGGAAAGACGACCGT |
||||||
|
Output |
||||||
|
AAACT AAATC AACAC AACAT AACCT AACTA AACTC AACTG AACTT AAGAA AAGCT AAGGT AAGTC AATAC AATAT AATCC AATCT AATGC AATTC AATTG ACAAC ACACA ACACC ACACG ACACT ACAGA ACAGC ACATC ACATG ACCAT ACCCT ACCGT ACCTA ACCTC ACCTG ACCTT ACGAC ACGAG ACGAT ACGCT ACGGT ACGTC ACGTT ACTAA ACTAG ACTAT ACTCA ACTCC ACTCG ACTCT ACTGA ACTGC ACTGT ACTTA ACTTC ACTTT AGAAA AGAAC AGAAG AGAAT AGACA AGACT AGATA AGATC AGCAT AGCCA AGCGT AGCTA AGCTC AGCTG AGCTT AGGAT AGGTA AGGTC AGTAA AGTAC AGTAT AGTCC AGTCG AGTCT AGTGA AGTTG ATAAA ATAAC ATACA ATACC ATAGA ATATA ATATC ATATG ATATT ATCAG ATCCC ATCCG ATCCT ATCGA ATCGC ATCTA ATCTC ATCTG ATGAC ATGAT ATGCA ATGCC ATGGA ATGGC ATGTA ATGTC ATTAA ATTAC ATTAG ATTAT ATTCA ATTCC ATTCG ATTGA ATTGC ATTGG ATTGT ATTTA ATTTC ATTTG ATTTT CAAAG CAACC CAACT CAAGA CAAGC CAATA CAATT CACAC CACAG CACCT CACGT CACTA CACTT CAGAA CAGAC CAGAT CAGGT CAGTA CAGTC CATAA CATAC CATAG CATAT CATCC CATCT CATGA CATGT CATTA CATTG CATTT CCAAG CCATA CCATG CCATT CCCGT CCCTA CCCTT CCGAA CCGAC CCGAT CCGCT CCGGT CCGTA CCGTC CCGTG CCGTT CCTAC CCTAT CCTCA CCTCC CCTTA CCTTC CCTTG CCTTT CGAAA CGAAG CGACA CGACT CGAGT CGATA CGATG CGATT CGCAA CGCAT CGCCA CGCGA CGCTA CGCTC CGCTT CGGAC CGGAT CGGCA CGGTA CGGTC CGGTT CGTAA CGTAC CGTCA CGTCG CGTCT CGTTA CGTTT CTAAC CTAAG CTAAT CTACA CTACC CTACG CTACT CTAGA CTAGC CTAGG CTAGT CTATA CTATC CTATG CTATT CTCAT CTCCG CTCGT CTCTA CTCTT CTGAA CTGAG CTGCA CTGCC CTGTA CTGTT CTTAA CTTAC CTTAG CTTAT CTTCA CTTGA CTTTA CTTTC CTTTG CTTTT GAAAT GAACA GAACT GAAGT GAATG GAATT GACAC GACAT GACCA GACCT GACGT GACTT GAGAA GAGAT GAGCT GATAA GATAC GATAG GATAT GATCA GATCC GATCG GATCT GATGT GATTA GATTC GATTG GATTT GCAAT GCACT GCATC GCATT GCCAT GCCGT GCCTA GCCTT GCGAT GCGGT GCGTC GCGTT GCTAA GCTAC GCTAG GCTAT GCTGA GCTGT GCTTA GCTTT GGAAT GGACA GGATA GGATC GGATT GGCTA GGGAT GGTAC GGTAG GGTAT GGTCA GGTCG GGTTA GTAAA GTAAG GTACA GTACC GTACG GTAGA GTATA GTATC GTATG GTATT GTCAA GTCAG GTCCG GTCCT GTCGA GTCGC GTCGT GTCTA GTCTG GTGAA GTGAG GTGCA GTGCG GTTAA GTTAC GTTAG GTTAT GTTCA GTTCC GTTCG GTTGA 
GTTTA TAAAC TAAAG TAACA TAACC TAACT TAAGA TAAGC TAATA TAATC TACAC TACAG TACCC TACCG TACCT TACGA TACGC TACGT TACTA TACTC TACTG TAGAA TAGAC TAGAG TAGAT TAGCC TAGCG TAGGA TAGTC TATAA TATAC TATAT TATCA TATCC TATCG TATGA TATGC TATGG TATGT TATTA TATTG TCAAC TCAAT TCACC TCACG TCACT TCAGA TCATA TCATG TCCAA TCCAC TCCAG TCCAT TCCCA TCCCT TCCGA TCCGC TCCGT TCCTA TCCTG TCCTT TCGAA TCGAC TCGAT TCGCC TCGCT TCGGA TCGGC TCGGG TCGGT TCGTC TCTAC TCTAG TCTAT TCTCC TCTCT TCTGG TCTGT TCTTA TCTTT TGAAA TGAAC TGAAT TGACA TGACC TGACT TGAGA TGAGC TGAGT TGATA TGATC TGATG TGATT TGCAA TGCAC TGCAG TGCAT TGCCA TGCCG TGCCT TGCGA TGCGT TGCTT TGGAA TGGAT TGGTA TGTAA TGTAG TGTAT TGTCC TGTCG TGTGG TGTTA TTAAA TTAAC TTAAG TTAAT TTACA TTACC TTACG TTACT TTAGA TTAGC TTAGG TTAGT TTATA TTATC TTATG TTATT TTCAA TTCAC TTCAT TTCCA TTCCC TTCCT TTCGA TTCGG TTCGT TTCTA TTCTG TTGAA TTGAC TTGAG TTGAT TTGCA TTGCG TTGGA TTGGG TTGTG TTTAA TTTAC TTTAG TTTAT TTTCA TTTCC TTTCG TTTGA TTTGG TTTTA TTTTG |
File diff suppressed because it is too large
@ -0,0 +1,4 @@ |
Input
CTTCTCACGTACAACAAAATC
Output
2161555804173
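The input/output pair above is consistent with reading the DNA string as a base-4 number with A=0, C=1, G=2, T=3. The sketch below reproduces that mapping and its inverse; the function names `PatternToNumber` and `NumberToPattern` are assumptions for this example and may differ from what the package exposes.

```go
package main

import "fmt"

// PatternToNumber interprets pattern as a base-4 number with
// A=0, C=1, G=2, T=3. Patterns longer than 31 bases overflow int64.
func PatternToNumber(pattern string) (int64, error) {
	var n int64
	for i := 0; i < len(pattern); i++ {
		var digit int64
		switch pattern[i] {
		case 'A':
			digit = 0
		case 'C':
			digit = 1
		case 'G':
			digit = 2
		case 'T':
			digit = 3
		default:
			return 0, fmt.Errorf("unexpected base %q at position %d", pattern[i], i)
		}
		n = n*4 + digit
	}
	return n, nil
}

// NumberToPattern is the inverse mapping: it writes the non-negative
// number n as a k-character DNA string, left-padded with 'A'.
func NumberToPattern(n int64, k int) string {
	const bases = "ACGT"
	out := make([]byte, k)
	for i := k - 1; i >= 0; i-- {
		out[i] = bases[n%4]
		n /= 4
	}
	return string(out)
}

func main() {
	n, err := PatternToNumber("CTTCTCACGTACAACAAAATC")
	fmt.Println(n, err)
	// prints 2161555804173 <nil>

	fmt.Println(NumberToPattern(n, 21))
	// prints CTTCTCACGTACAACAAAATC
}
```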
File diff suppressed because one or more lines are too long
Some files were not shown because too many files have changed in this diff