Public Member Functions |
| gkArrays (char *tags_file, uint threshold, bool use_bitvector=false, uint tag_length=0, bool stranded=false, uint nb_threads=1) |
| gkArrays (char *tags_file1, char *tags_file2, uint threshold, bool use_bitvector=false, uint tag_length=0, bool stranded=false, uint nb_threads=1) |
uintSA | convertPposToQpos (uintSA i) |
uintSA | getEndPosOfTagNum (uint tag_num) |
uintSA | getGkCFA (uintSA i) |
uintSA | getGkCFALength () |
uintSA | getGkISA (uintSA i) |
uintSA | getGkSA (uintSA i) |
uintSA | getGkSALength () |
uintSA | getNbPposition (uintSA nb_reads) |
uint | getNbTags () |
uint | getNbTagsWithFactor (uint tag_num, uint pos_factor, bool multiplicity=0) |
uint | getNbThreads () |
uint | getPair (uint i) |
uintSA | getPosInCommon (uint tag_num, uint pos_factor) |
readsReader * | getReads () |
uintSA | getStartPosOfTagNum (uint tag_num) |
uintSA | getStartQPosOfTagNum (uint tag_num) |
uint * | getSupport (uint i) |
uint | getSupportLength (uint i=0) |
char * | getTag (uint i) |
uint | getTagLength (uint i=0) |
char * | getTagFactor (uint i, uint p, uint l) |
uint | getTagNum (uintSA pos) |
std::pair< uint, uint > | getTagNumAndPosFromAbsolutePos (uintSA pos) |
uint * | getTagNumWithFactor (uint tag_num, uint pos_factor) |
std::pair< uint, uint > * | getTagsWithFactor (uint tag_num, uint pos_factor) |
std::pair< uint, uint > * | getTagsWithFactor (char *factor, uint factor_length, uint &nb_fact) |
char * | getTextFactor (uintSA pos, uint length) |
uint | getThreshold () |
array_type | getType () |
bool | isLarge () |
bool | isPposition (uintSA pos) |
bool | isStranded () |
bool | isTheFirstMemberOfPair (uint i) |
Static Public Member Functions |
static bool | isDiscarded (uint actual_length, uint theoretical_length=0, uint k=0) |
Constructor & Destructor Documentation
gkarrays::gkArrays::gkArrays (char *tags_file, uint threshold, bool use_bitvector = false, uint tag_length = 0, bool stranded = false, uint nb_threads = 1)
Construct the read index
- Parameters:
-
tags_file | Name of the file containing the reads |
threshold | Length of the k-mers to use |
use_bitvector | true iff the array must be stored using a bit vector (slower but more space efficient) |
tag_length | Length of the reads. If a shorter read is found, an error is raised. If a longer read is found, only its prefix of tag_length characters is kept. If tag_length == 0 (default), the read length is guessed. |
stranded | true iff we know which strand has been sequenced and, therefore, (for instance) AACG must not be considered equal to its revcomp (CGTT). |
nb_threads | Number of threads used to build GkSA on a multi-threaded architecture |
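The effect of the stranded parameter can be illustrated with a small standalone sketch. The helpers below (complement, reverseComplement, canonicalKmer) are hypothetical names that are not part of gkArrays, and keeping the lexicographically smaller of a k-mer and its reverse complement is only one common way to model "considered equal"; the library may do this differently.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: complement of one nucleotide (A<->T, C<->G).
// Any other character is returned unchanged.
char complement(char c) {
    switch (c) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:  return c;
    }
}

// Hypothetical helper: reverse complement of a k-mer.
std::string reverseComplement(const std::string &kmer) {
    std::string rc(kmer.rbegin(), kmer.rend());
    std::transform(rc.begin(), rc.end(), rc.begin(), complement);
    return rc;
}

// Assumption: in the non-stranded case a k-mer and its reverse complement
// are mapped to the same representative (the smaller of the two strings).
// In the stranded case the k-mer is kept as it is.
std::string canonicalKmer(const std::string &kmer, bool stranded) {
    if (stranded) return kmer;
    return std::min(kmer, reverseComplement(kmer));
}
```

With this sketch, canonicalKmer("CGTT", false) and canonicalKmer("AACG", false) give the same string, whereas with stranded set to true the two k-mers stay distinct.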
gkarrays::gkArrays::gkArrays (char *tags_file1, char *tags_file2, uint threshold, bool use_bitvector = false, uint tag_length = 0, bool stranded = false, uint nb_threads = 1)
Alternative constructor that builds the read index from paired-end reads
- Parameters:
-
tags_file1 | Name of the file containing the first read of each pair |
tags_file2 | Name of the file containing the second read of each pair |
threshold | Length of the k-mers to use |
use_bitvector | true iff the array must be stored using a bit vector (slower but more space efficient) |
tag_length | Length of the reads. If a shorter read is found, an error is raised. If a longer read is found, only its prefix of tag_length characters is kept. If tag_length == 0 (default), the read length is guessed. |
stranded | true iff we know which strand has been sequenced and, therefore, (for instance) AACG must not be considered equal to its revcomp (CGTT). |
nb_threads | Number of threads used to build GkSA on a multi-threaded architecture |
Member Function Documentation
Converts a position from P-position to Q-position (if you do not understand this, please read our article!). This converts a position in the concatenation of reads to the corresponding position in GkIFA (for example). In the article, values of GkSA are also renumbered to Q-positions, but we do not renumber them in practice, since doing so would serve no purpose.
- Parameters:
-
- Returns:
- a Q-position
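A minimal sketch of the conversion, under simplifying assumptions that are not taken from the library: all reads have the same length m, they are concatenated without separators, positions start at 0, and a P-position is a position that does not lie in the last k - 1 characters of a read. The Layout struct and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: fixed-length reads concatenated without separators.
struct Layout {
    std::uint64_t read_length; // m, length of every read
    std::uint64_t k;           // threshold (k-mer length), with k <= m
};

// A position is a P-position if a full k-mer starts there inside its read.
bool isPpositionSketch(const Layout &l, std::uint64_t pos) {
    return pos % l.read_length + l.k <= l.read_length;
}

// Each read contributes m - k + 1 P-positions, so the Q-position is the
// number of P-positions in the preceding reads plus the offset in the read.
std::uint64_t pposToQposSketch(const Layout &l, std::uint64_t pos) {
    assert(isPpositionSketch(l, pos));
    std::uint64_t read = pos / l.read_length;
    std::uint64_t offset = pos % l.read_length;
    return read * (l.read_length - l.k + 1) + offset;
}
```

For example, with m = 5 and k = 3, each read has three P-positions, so position 5 (the first character of the second read) maps to Q-position 3.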
Gives the end position of a given read in the concatenation of reads.
- Parameters:
-
- Returns:
- the end position of the read #tag_num in C_R (the concatenation of reads)
- Parameters:
-
i | the index position in the array (starting at 0). |
- Returns:
- the value of GkCFA at the given index, i.e., the number of k-factors of rank i, where i is the requested index.
- Returns:
- the number of elements in the GkCFA array. In other words, it corresponds to the number of distinct k-mers in the input.
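To make the relation between GkCFA and distinct k-mers concrete, here is a brute-force sketch that is independent of the library. It assumes a stranded index and that the rank of a k-mer is its lexicographic rank among the distinct k-mers; countKmersByRank is a hypothetical name, and the real structure is built very differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Returns one counter per distinct k-mer, ordered by lexicographic rank
// (an assumption): entry i is the number of occurrences of the k-mer of
// rank i over all reads. The size of the result is the number of distinct
// k-mers.
std::vector<std::size_t> countKmersByRank(const std::vector<std::string> &reads,
                                          std::size_t k) {
    assert(k > 0);
    std::map<std::string, std::size_t> counts; // std::map iterates in sorted order
    for (const std::string &read : reads) {
        for (std::size_t p = 0; p + k <= read.size(); ++p) {
            ++counts[read.substr(p, k)];
        }
    }
    std::vector<std::size_t> cfa;
    for (const auto &entry : counts) {
        cfa.push_back(entry.second);
    }
    return cfa;
}
```

For the reads ACGT and CGTA with k = 3, the k-mers are ACG, CGT (twice) and GTA, so the sketch yields three counters.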
- Parameters:
-
i | the index position in the array (starting at 0). |
- Returns:
- the value of GkISA at the given index, i.e., the rank of the k-factor at P-position i.
- Parameters:
-
i | the index position in the array (starting at 0). |
- Returns:
- the value of GkSA at the given index, i.e., the P-position of the k-factor whose rank is i.
- Returns:
- the number of entries in GkSA (i.e., the number of P-positions)
- Returns:
- the number of P-positions in C_R for a given number of reads (of fixed length or not). This function is available before the construction of GkSA.
- Returns:
- the number of tags (or reads) indexed in the Gk Arrays
- Parameters:
-
tag_num | The number of the tag in the index |
pos_factor | Position of the factor in the tag |
multiplicity | If false, a tag that contains the factor several times is counted only once |
- Returns:
- the number of tags sharing the factor starting at position pos_factor in the tag tag_num. This is the number of elements returned by the function getTagsWithFactor().
- Returns:
- the number of threads the GkArrays have been told to use. The threads can be used for the construction.
- Parameters:
-
i | The number of the tag in the index |
- Returns:
- the tag number of the paired-end read associated with i or -1 if reads are not paired-end.
- Returns:
- the rank of the Pk-factor starting at position pos_factor in the read number tag_num.
Gives the start position of a given read in the concatenation of reads.
- Parameters:
-
- Returns:
- the start position of the read #tag_num in C_R (the concatenation of reads)
Gives the start Q-position of a given read in the GkISA array.
- Parameters:
-
- Returns:
- the start Q-position of the read #tag_num in GkISA.
- Parameters:
-
- Returns:
- an array whose length is getSupportLength(i) and where the value at position p is the number of occurrences, among all the Pk-factors, of the k-factor starting at position p in the read.
Return the length of the support.
- Returns:
- getTagLength(i) - getThreshold()+1
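The support and its length can be illustrated by a brute-force sketch. It does not use the library: supportSketch is a hypothetical function working on reads held in memory, it assumes a stranded index, and it counts every occurrence of a k-mer across all reads.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Support of read i: for each start position p of a k-mer in the read,
// the number of occurrences of that k-mer among the k-mers of all reads.
// The result has reads[i].size() - k + 1 entries, matching
// getTagLength(i) - getThreshold() + 1.
std::vector<unsigned> supportSketch(const std::vector<std::string> &reads,
                                    std::size_t i, std::size_t k) {
    assert(k > 0 && i < reads.size() && reads[i].size() >= k);
    // Count every k-mer occurrence over the whole collection.
    std::map<std::string, unsigned> counts;
    for (const std::string &read : reads) {
        for (std::size_t p = 0; p + k <= read.size(); ++p) {
            ++counts[read.substr(p, k)];
        }
    }
    // Look up the count of each k-mer of read i.
    const std::string &read = reads[i];
    std::vector<unsigned> support(read.size() - k + 1);
    for (std::size_t p = 0; p < support.size(); ++p) {
        support[p] = counts[read.substr(p, k)];
    }
    return support;
}
```

For the reads ACGT and CGTA with k = 3, the support of the first read has two entries: one for ACG, which occurs once, and one for CGT, which occurs twice.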
- Parameters:
-
i | the read number to be retrieved |
- Returns:
- the read number i.
- Parameters:
-
i | The number of the tag in the index |
p | Position of the factor in the tag |
l | The length of the factor |
- Returns:
- the factor at the position p in the tag number i
- Parameters:
-
i | Tag number (if the length is not constant) |
- Returns:
- the length of the read.
Gives the number of a read
- Parameters:
-
pos | a position in SA or in the concatenated sequence of reads |
- Returns:
- the read number where this position lies
Return the number of tag and the relative position in that tag corresponding to a given position in the concatenation of reads
- Parameters:
-
pos | position in the concatenation of reads |
- Returns:
- a pair whose first element is the tag number and whose second element is the position in the tag.
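One way to picture this lookup is a binary search over the start positions of the reads. The sketch below is an assumption about how such a mapping can be computed, not the library's implementation: locateSketch and the starts vector are hypothetical, and starts[t] is taken to be the start of read t in the concatenation, in increasing order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Maps an absolute position in the concatenation of reads to the pair
// (tag number, position inside that tag).
// Assumption: starts is sorted in increasing order and starts[0] <= pos.
std::pair<std::size_t, std::uint64_t>
locateSketch(const std::vector<std::uint64_t> &starts, std::uint64_t pos) {
    assert(!starts.empty() && starts.front() <= pos);
    // First read that starts strictly after pos; the read before it holds pos.
    auto it = std::upper_bound(starts.begin(), starts.end(), pos);
    std::size_t tag = static_cast<std::size_t>(it - starts.begin()) - 1;
    return {tag, pos - starts[tag]};
}
```

With reads starting at positions 0, 4 and 9, position 5 falls in the second read (tag 1) at relative position 1.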
- Parameters:
-
tag_num | The number of the tag in the index |
pos_factor | Position of the factor in the tag |
- Returns:
- an array containing each tag number in which the factor occurs.
- Postcondition:
- The array is sorted
- Parameters:
-
tag_num | The number of the tag in the index |
pos_factor | Position of the factor in the tag |
- Returns:
- Return an array composed of pairs (tag, pos) corresponding to all the Pk-factors equal to the Pk-factor starting at position pos_factor in the tag tag_num.
- Postcondition:
- The array is sorted according to read number and read position
- Parameters:
-
factor | the pattern to be searched. |
factor_length | the length of the factor, should be <= getThreshold() |
nb_fact | output parameter set to the number of occurrences stored in the returned array. |
- Returns:
- Return an array composed of pairs (tag, pos) corresponding to all the Pk-factors equal to the k-factor factor
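The result of such a pattern query can be mimicked with a brute-force scan. This is only a sketch under stated assumptions: findFactorSketch is a hypothetical function, the index is taken to be stranded, and a factor shorter than k is assumed to match every Pk-factor that starts with it. The library answers the query from its arrays rather than by scanning the reads.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Returns all (tag, pos) pairs such that the k-mer starting at position
// pos of read tag begins with factor. Only positions where a full k-mer
// fits in the read (P-positions) are reported.
std::vector<std::pair<unsigned, unsigned>>
findFactorSketch(const std::vector<std::string> &reads,
                 const std::string &factor, std::size_t k) {
    assert(!factor.empty() && factor.size() <= k);
    std::vector<std::pair<unsigned, unsigned>> hits;
    for (std::size_t tag = 0; tag < reads.size(); ++tag) {
        const std::string &read = reads[tag];
        for (std::size_t pos = 0; pos + k <= read.size(); ++pos) {
            if (read.compare(pos, factor.size(), factor) == 0) {
                hits.emplace_back(static_cast<unsigned>(tag),
                                  static_cast<unsigned>(pos));
            }
        }
    }
    return hits; // ordered by tag, then by position, by construction
}
```

The size of the returned vector plays the role of nb_fact in this sketch.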
- Parameters:
-
pos | The position from which we want to retrieve a text substring. The position must be given in the original text (not the filtered one). |
length | the length of the substring to be retrieved. |
- Returns:
- the text factor of the given length starting at position pos. The returned string is null-terminated.
- Returns:
- the length of the k-factors (i.e., k).
- Returns:
- the array type used for building GkSA and GkISA (either SMALL_ARRAY, LARGE_ARRAY or OPTIMAL_ARRAY).
- Returns:
- true iff the read is not suitable, i.e., if it is shorter than the specified length (if any) or shorter than the specified k-mer length.
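Read literally, this rule can be restated as a small predicate. The function below, isDiscardedSketch, is a hypothetical restatement that mirrors the documented signature; treating theoretical_length == 0 as "no length specified" is an assumption based on the default value, and the actual implementation may differ.

```cpp
#include <cassert>

// A read is discarded if it is shorter than the expected read length
// (when one is given, i.e. theoretical_length != 0) or shorter than k.
bool isDiscardedSketch(unsigned actual_length,
                       unsigned theoretical_length = 0,
                       unsigned k = 0) {
    if (theoretical_length != 0 && actual_length < theoretical_length) {
        return true; // shorter than the specified read length
    }
    return actual_length < k; // too short to contain a single k-mer
}
```

For instance, a read of 50 characters is rejected when 75 characters are expected, and a read of 10 characters is rejected for k = 20 even when no read length is specified.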
- Returns:
- true if the number of P-positions is greater than 2^32
- Returns:
- true iff the position does not lie in the last threshold - 1 characters of a read, i.e., if it is a P-position.
- Returns:
- true iff the GkArrays have been built as a strand-dependent index. In that case, a k-mer and its revcomp are not considered equal.
- Parameters:
-
i | the number of the tag in the index |
- Returns:
- true if the tag is the first member of its pair in the case of paired-end files, false otherwise.