Beyond keyword search
GoPubMed® allows users to find information significantly faster and guarantees completeness of search results. The fundamental difference between GoPubMed’s® semantic search technology and traditional search engines such as PubMed or Google is the use of background knowledge. Semantic algorithms connect text (abstracts from the MEDLINE database) to background knowledge in the form of semantic networks of concept categories, also called ontologies or knowledge bases. The connection is made by meaning, not by keywords alone, so results are meaningfully structured and intelligent semantic navigation becomes possible. The concept categories come from the Gene Ontology (GO), the Medical Subject Headings (MeSH), the Universal Protein Resource (UniProt), Authors, Locations, Journals, and Publication Dates. In GoPubMed® the user does the ranking. The examples below demonstrate this semantic approach.
The background knowledge
The knowledge behind GoPubMed® comprises 48 million concepts in total. The majority, 47 million, come from authors; 45,000 come from the GO and MeSH ontologies (plus about 150,000 synonyms), 62,000 from proteins, 23,000 from journals, 13,000 from geolocations, and 35,000 from time-related concepts.
GO describes genes and gene products in different organisms. It has three top-level concepts (also called classes or categories):
- The molecular function of gene products
- Their role in multi-step biological processes
- Their localization to cellular components
GoPubMed® uses 12 of the 16 top-level concepts of MeSH:
- Biological Sciences
- Chemicals and Drugs
- Health Care
- Named Groups
- Natural Sciences
- Psychiatry and Psychology
- Techniques and Equipment
- Technology, Industry, Agriculture
More specific concepts are found at deeper levels in the tree of concept categories. Sorting abstracts this way enables combined searches across molecular biology, medicine, and metadata. The categories help you explore the result set systematically and lead to faster and more complete search results.
Try this first: Search for “All” and retrieve more than 23 million PubMed articles. Now explore the tree on the left-hand side and refine your search semantically! Click on a concept to browse, filter, or exclude it, or to add it to your favorites.
When you choose the “Browse” option, the selection is not sticky: the next browse click on another concept replaces the previous browse selection. This is helpful when you have already used the “With” option and want to check other related concepts. The “Without” option excludes from the search results all documents containing this concept or any of its sub-concepts! The “Add to favorites” option adds the concept to your favorites (you must be registered and logged in), which makes customized searches easy.
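The way these options combine can be pictured with a small Python sketch. It is only an illustration, not GoPubMed® code: the concept tree, the article annotations, and the function names are all hypothetical, and it assumes that “With” keeps only documents containing a concept (or a sub-concept), while “Browse” behaves the same way but holds a single selection at a time.

```python
# Hypothetical miniature concept tree: child -> parent (None marks a root).
PARENT = {
    "Diseases": None,
    "Heart Diseases": "Diseases",
    "Arrhythmias": "Heart Diseases",
    "Neoplasms": "Diseases",
}

# Hypothetical articles, each annotated with the concepts found in its abstract.
ARTICLES = {
    1: {"Arrhythmias"},
    2: {"Heart Diseases", "Neoplasms"},
    3: {"Neoplasms"},
    4: set(),
}


def with_descendants(concept):
    """Return the concept together with all of its sub-concepts."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def apply_filters(with_concepts=(), without_concepts=(), browse=None):
    """Return the sorted ids of articles that satisfy the current selections."""
    required = list(with_concepts)
    if browse is not None:
        # Assumption: only one browse selection exists at a time, so a new
        # browse click simply replaces this value.
        required.append(browse)
    hits = []
    for article_id, concepts in ARTICLES.items():
        keep = all(concepts & with_descendants(c) for c in required)
        drop = any(concepts & with_descendants(c) for c in without_concepts)
        if keep and not drop:
            hits.append(article_id)
    return sorted(hits)
```

In this toy data, excluding “Heart Diseases” also removes the article annotated only with the sub-concept “Arrhythmias”, which mirrors the “Without” behavior described above.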
The left-hand refine section represents the background knowledge network as a tree and serves as a semantic table of contents. It structures the articles on the right-hand side for a specific query and gives a statistical overview of the result set: next to each concept is the number of articles found that contain this concept or one of its sub-concepts. By selecting concepts in the tree you can semantically narrow thousands of search results down to a few in seconds. The benefits are substantial time savings and completeness.
Why time savings? Why completeness? Consider an example: suppose you are searching for heart diseases. A simple search for this term returns around 100,000 results. If you instead search for “heart diseases”[mesh], GoPubMed® uses the knowledge in the MeSH ontology to locate more documents related to the concept. In the first case you get only documents whose text matches heart diseases; in the second, the results also include documents that match synonyms of the concept as defined by the MeSH ontology. Searching this way returns around one million results, which you can then refine by selecting the concepts that interest you. You get the complete results without having to run multiple searches for the same term expressed differently, and without the hassle of combining the fragments of results those separate searches would produce.
Now let’s have a closer look at the concept categories.
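As a rough illustration of the heart diseases example, the following Python sketch contrasts a plain text match with a concept-based search that also matches synonyms and sub-concepts. Everything in it is an assumption made for the example: the synonym table, the sub-concept table, the four abstracts, and the function names are hypothetical, and simple substring matching stands in for whatever annotation GoPubMed® actually performs.

```python
# Hypothetical synonym table: concept -> lowercase surface forms.
SYNONYMS = {
    "Heart Diseases": ["heart diseases", "heart disease", "cardiac diseases"],
    "Arrhythmias": ["arrhythmia", "irregular heartbeat"],
}

# Hypothetical sub-concept table: concept -> direct children.
SUB_CONCEPTS = {"Heart Diseases": ["Arrhythmias"], "Arrhythmias": []}

# Hypothetical abstracts keyed by article id.
ABSTRACTS = {
    1: "A cohort study of heart diseases in adults.",
    2: "Cardiac diseases after chemotherapy.",
    3: "Irregular heartbeat detected by wearable sensors.",
    4: "Soil bacteria and nitrogen fixation.",
}


def keyword_search(phrase):
    """Plain search: keep abstracts that contain the phrase literally."""
    phrase = phrase.lower()
    return sorted(i for i, text in ABSTRACTS.items() if phrase in text.lower())


def concept_terms(concept):
    """Collect the synonyms of a concept and of all its sub-concepts."""
    terms = list(SYNONYMS.get(concept, []))
    for child in SUB_CONCEPTS.get(concept, []):
        terms.extend(concept_terms(child))
    return terms


def concept_search(concept):
    """Concept search: keep abstracts matching any synonym or sub-concept term."""
    terms = concept_terms(concept)
    return sorted(
        i for i, text in ABSTRACTS.items()
        if any(term in text.lower() for term in terms)
    )
```

With this toy data, `keyword_search("heart diseases")` finds only the abstract that contains the literal phrase, while `concept_search("Heart Diseases")` also finds the abstracts that use a synonym or mention a sub-concept, so a single query replaces several differently worded ones.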