Search queries

This section explains how to build search queries. Bear in mind that queries are case sensitive, and that this applies to the whole query syntax.

Search by string

Retrieve documents containing a specific string. Example: insulin

Search by document ID (docid)

Each document in your project has a unique document ID (docid). This search query retrieves the file matching your given docid. You can also use modifiers such as a wildcard.

Example, search a unique document by docid: docid:bXYmSmlclyO01lP1kOLKSA0cyPAT-letter.txt

Example using the wildcard: docid:*-letter.txt or docid:bXYmSmlclyO01lP1kOLKSA0cyPAT*.

Search by filename

Retrieve all files matching some filename, possibly with a wildcard.

Example, search all your pdf files: filename:*.pdf

Example, search a specific file with spaces: filename:"My filename has some spaces.md"
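As a minimal sketch, a filename query could be assembled programmatically like this. The helper name filename_query is hypothetical and not part of the product; it only mirrors the two examples above by quoting the name when it contains whitespace.

```python
def filename_query(name: str) -> str:
    """Build a filename: query string (hypothetical helper)."""
    # Quote the name only when it contains whitespace, as in the example above.
    if any(ch.isspace() for ch in name):
        return f'filename:"{name}"'
    return f"filename:{name}"


print(filename_query("*.pdf"))
print(filename_query("My filename has some spaces.md"))
```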

Search by document label

Find documents tagged with a specific label and value.

Boolean example: label:isSevere:true

Enum example: label:severity:high

String example: label:name:Lois

Range example: label:number_issues:[10 TO 20]
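The four label forms above follow one pattern, label:NAME:VALUE. A small sketch of a builder for them follows; the function label_query and its convention of passing a (low, high) tuple for ranges are assumptions made for illustration only.

```python
def label_query(label: str, value) -> str:
    """Build a label: query string (hypothetical helper)."""
    if isinstance(value, bool):
        # Booleans are written in lowercase, as in label:isSevere:true.
        rendered = "true" if value else "false"
    elif isinstance(value, tuple):
        # A (low, high) tuple is rendered as a range.
        low, high = value
        rendered = f"[{low} TO {high}]"
    else:
        # Enum and string values are used as they are.
        rendered = str(value)
    return f"label:{label}:{rendered}"


print(label_query("isSevere", True))
print(label_query("number_issues", (10, 20)))
```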

Search by entity type

Retrieve all documents containing at least one entity that belongs to the given entity type. Example: entity:disease, retrieves all documents with at least one entity of the type disease.

If you add a term, e.g. entity:disease:cancer, you can find all documents containing at least one entity using that term.


You can also run more advanced queries. These work only with the entity type id (e.g. e_1):

  • count, e.g. count_e_1:[2 TO *]: retrieves documents with at least 2 annotations of the type e_1.
  • norms_count_uniq, e.g. norms_count_uniq_e_1:[2 TO *]: retrieves documents with at least 2 annotations of the type e_1 that are normalized to different unique names. For example, Rezulin and Romozin, the same diabetes drug sold under different commercial names, would both be normalized to troglitazone, so they count as 1 unique normalized entity, not 2.
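The difference between the two counts can be illustrated with a short sketch. The data and the helper count_query are made up for this example; only the query shapes come from the text above.

```python
# Two annotations of the same entity type: (annotated text, normalized name).
annotations = [
    ("Rezulin", "troglitazone"),
    ("Romozin", "troglitazone"),
]

# What count_* looks at: the number of annotations.
total_count = len(annotations)

# What norms_count_uniq_* looks at: the number of distinct normalized names.
unique_norm_count = len({normalized for _, normalized in annotations})


def count_query(entity_type_id: str, minimum: int, unique_norms: bool = False) -> str:
    """Build an 'at least N' count query (hypothetical helper)."""
    prefix = "norms_count_uniq" if unique_norms else "count"
    return f"{prefix}_{entity_type_id}:[{minimum} TO *]"


print(total_count, unique_norm_count)
print(count_query("e_1", 2))
print(count_query("e_1", 2, unique_norms=True))
```

With this data, a document would satisfy count_e_1:[2 TO *] but not norms_count_uniq_e_1:[2 TO *].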

Search by normalization

Retrieve all documents containing at least one entity that normalizes to the given normalization. Example: entity:genes:HER2, retrieves all documents with at least one entity gene that normalizes to HER2.

Search by date

Retrieve all documents imported or updated in a given time frame.

created: documents imported in a given time frame. Examples: created:2018, created:2018-03, created:2018-03-06, created:[2013 TO NOW], created:[2016-12 TO 2017-02], created:[NOW-1DAY TO NOW] - documents imported since the previous day.

updated: documents updated in a given time frame. Examples: updated:2018, updated:2018-03, updated:2018-03-06, updated:[2013 TO NOW], updated:[2016-12 TO 2017-02], updated:[NOW-1DAY TO NOW] - documents updated since the previous day.
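A month-precision range such as created:[2016-12 TO 2017-02] could be generated from date objects as sketched below. The function month_range_query is a hypothetical helper, not something the product provides.

```python
from datetime import date


def month_range_query(field: str, start: date, end: date) -> str:
    """Build a created:/updated: range query with month precision (hypothetical helper)."""
    if field not in ("created", "updated"):
        raise ValueError("field must be 'created' or 'updated'")
    # Format both ends as YYYY-MM, matching the range examples above.
    return f"{field}:[{start.strftime('%Y-%m')} TO {end.strftime('%Y-%m')}]"


print(month_range_query("created", date(2016, 12, 1), date(2017, 2, 1)))
```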

Search by folder

There are three ways to search by folder:

  1. Search by folder index (folder:INDEX): the folder indexes (integers) are listed in the project settings JSON. Note the index of the folder you want and search with folder:INDEX. For example, to search the pool documents (a special folder that is always created), use folder:0.
  2. Search by folder path (folder:PATH): for example, if your folder sits at pool > level1 > A, write the folder path as in Unix: folder:pool/level1/A. Leading or trailing slashes are discouraged, although they are accepted and ignored.
  3. Search by folder name (folder:NAME): following the previous example, you could simply search for folder:A. If several folders share the same name, the one closest to the root level (the pool), that is, the shallowest one in the folder tree, is found. For instance, given the folders pool/level1/A and pool/level1/level2/A, the former is found. Caveat: if several folders share the same name at the same level of the folder tree, one of them is chosen arbitrarily.
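The name resolution rule in the third option can be sketched as follows. This is an illustration of the rule as described, not the actual implementation; the function resolve_folder_by_name is hypothetical, and it models the "arbitrary" choice on ties by simply returning the first match.

```python
from typing import Iterable, Optional


def resolve_folder_by_name(name: str, folder_paths: Iterable[str]) -> Optional[str]:
    """Return the shallowest folder path whose last component equals name."""
    # Leading and trailing slashes are ignored, as for folder:PATH.
    normalized = [path.strip("/") for path in folder_paths]
    candidates = [path for path in normalized if path.split("/")[-1] == name]
    if not candidates:
        return None
    # The folder closest to the root wins; on ties min() keeps the first one seen.
    return min(candidates, key=lambda path: len(path.split("/")))


folders = ["pool/level1/level2/A", "pool/level1/A", "pool/level1"]
print(resolve_folder_by_name("A", folders))
```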

Search confirmed documents

To find the confirmed documents, use the query: anncomplete:true.

To find the documents that are not confirmed, use the query: anncomplete:false.

Search which documents a user has confirmed

You can retrieve the documents a given member has confirmed with the query: members_anncomplete:username

You can also retrieve all the documents that have been confirmed by at least one member with the query: members_anncomplete:*

To find the documents that every user in a set has confirmed, combine one query per user with AND, as in this example: members_anncomplete:user1 AND members_anncomplete:user2 AND members_anncomplete:user3
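For a longer list of users, the combined query can be generated rather than typed. A minimal sketch, assuming a hypothetical helper named confirmed_by_all:

```python
from typing import Iterable


def confirmed_by_all(usernames: Iterable[str]) -> str:
    """Join one members_anncomplete: clause per user with AND (hypothetical helper)."""
    return " AND ".join(f"members_anncomplete:{user}" for user in usernames)


print(confirmed_by_all(["user1", "user2", "user3"]))
```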

Search which documents a user was assigned to

You can retrieve the documents distributed to a given member, with the query: members_assigned:username

You can also retrieve all the documents that have at least one assignee, with the query: members_assigned:*

You can combine the query fields with boolean logic, for example to find all documents allocated to two given users: members_assigned:user-A AND members_assigned:user-C

Wildcard search

To perform a single character wildcard search use ?. Example: entity:gene:P?2649.

To perform a multiple character wildcard search use *. Example: "Kepler-2*", "Kepler-4*c".

Tip: find all documents by just searching for *.
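To get a feel for what ? and * match, Python's standard fnmatch module uses the same two wildcard characters. The sketch below only illustrates the wildcard idea; the search engine's own matching may differ in details, and the sample values are made up.

```python
from fnmatch import fnmatchcase


def matches(value: str, pattern: str) -> bool:
    """Case-sensitive wildcard match: ? is one character, * is any run of characters."""
    return fnmatchcase(value, pattern)


print(matches("P02649", "P?2649"))        # one character in place of ?
print(matches("P002649", "P?2649"))       # two characters do not fit a single ?
print(matches("Kepler-22b", "Kepler-2*"))
print(matches("Kepler-452c", "Kepler-4*c"))
```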

filter:TODO

The special search filter:TODO lists the documents that the logged-in user still has to annotate or review, if any. See Task Distribution and Annotation Flows.

Note that you cannot search the TODO list of other users; the filter is only available for the currently logged-in user.

Fuzzy search

Find similar terms (string-based search) using the Levenshtein distance, also known as edit distance. Append ~ to a single-word term. Example: roam~ also finds terms such as foam.

You can fine-tune the similarity level by appending a number between 0 (less similar) and 1 (more similar). Example: roam~0.8.
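For reference, the Levenshtein distance between two words is the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other. The sketch below is a textbook implementation, included only to show why roam and foam are close; how the similarity number maps to a distance inside the search engine is not covered here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, computed row by row."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("roam", "foam"))  # 1: a single substitution, r -> f
```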

Proximity search

Find words (string-based search) that occur within a given distance of each other. Example: "diabetes insulin"~10 finds documents in which the terms diabetes and insulin occur within 10 words of each other.
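The idea behind proximity search can be sketched in a few lines. This is a simplified illustration that splits on whitespace and compares word positions; it is an assumption for teaching purposes and does not reproduce the search engine's tokenization or its exact way of counting distance.

```python
def within_distance(text: str, first: str, second: str, max_distance: int) -> bool:
    """True if first and second occur at most max_distance word positions apart."""
    words = text.split()
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


sentence = "diabetes is treated with insulin"
print(within_distance(sentence, "diabetes", "insulin", 10))
print(within_distance(sentence, "diabetes", "insulin", 3))
```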

Boolean operators

Search queries can be combined using the operators AND, OR, NOT and -. Some examples:

  • entity:GGP AND entity:Mutation: searches for documents that contain both GGP entities and Mutation entities.
  • "type 1 diabetes" OR insulin: searches for documents that contain "type 1 diabetes" or "insulin".
  • "type 1 diabetes" NOT insulin: searches for documents that contain "type 1 diabetes" but not "insulin". This operator cannot be used with just one term.
  • -entity:GGP: searches for documents that contain no mentions of genes, i.e. no GGP entities.
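The operators above behave like set operations on the documents matched by each term. The sketch below shows this on a made-up toy index; it illustrates the logic only and is not how the search engine is implemented.

```python
# Toy index: document name -> terms found in it (made-up data).
documents = {
    "doc1": {"type 1 diabetes", "insulin"},
    "doc2": {"type 1 diabetes"},
    "doc3": {"insulin"},
    "doc4": {"metformin"},
}


def containing(term: str) -> set:
    """Names of the documents that contain the given term."""
    return {name for name, terms in documents.items() if term in terms}


diabetes = containing("type 1 diabetes")
insulin = containing("insulin")

both = diabetes & insulin               # "type 1 diabetes" AND insulin
either = diabetes | insulin             # "type 1 diabetes" OR insulin
only_diabetes = diabetes - insulin      # "type 1 diabetes" NOT insulin
without_insulin = set(documents) - insulin  # -insulin

print(sorted(both), sorted(either), sorted(only_diabetes), sorted(without_insulin))
```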

Escaping Special Characters

To search for a term that contains a special character, that is, a character that is part of the query syntax, put a \ before that character. For example, to search for PD-L1 use the query: PD\-L1.
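Escaping can be automated with a small helper such as the sketch below. The function escape_term is hypothetical, and the set of characters to escape is passed in as a parameter because the only special character shown in this section is -; extend it with the characters your queries actually need.

```python
def escape_term(term: str, special: str = "-") -> str:
    """Prefix every character of term that appears in special with a backslash."""
    # The default set contains only "-", the one character shown in the example above.
    return "".join("\\" + char if char in special else char for char in term)


print(escape_term("PD-L1"))
```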