Tuesday, February 21, 2012

Bio-databases...Part 2



15. The ENTREZ documentation mentions “E-utilities”. A link on the ENTREZ site
leads to the documentation of E-utilities. Please explain what E-utilities are and
what they can be used for.

The E-utilities translate a standard set of input parameters into the values necessary for various NCBI software components to search for and retrieve the requested data. The E-utilities are therefore the structured interface to the Entrez system, which currently includes 38 databases covering a variety of biomedical data, including nucleotide and protein sequences, gene records, three-dimensional molecular structures, and the biomedical literature.
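As a rough illustration of what "a standard set of input parameters" means, the sketch below only assembles an ESearch request URL and sends nothing over the network. The base address and the parameters db, term, retmax and retmode come from the public E-utilities documentation; the helper name build_esearch_url and the example query are made up for this post.

```python
from urllib.parse import urlencode

# Base address of the E-utilities as given in the NCBI documentation.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Assemble an ESearch URL (hypothetical helper; no request is sent)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example query: the search term is an arbitrary illustration.
url = build_esearch_url("pubmed", "microarray AND MIAME", retmax=5)
print(url)
```

The same fixed parameter names work for every Entrez database; only the value of db changes, which is what makes the interface "structured".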

16. What categories of biodatabases are integrated under SRS and which ones are not?

Plant databases, organelle databases, immunological databases, microarray data and other gene expression databases are NOT included.


17. How do you link query results in SRS and how do you perform
“faceted searches” (multiple searches) using SRS? What is the usage of search results from one query as the starting group for the next query?

Select the search result you are interested in, then click the ‘Link’ option on the left side of the screen. When you are directed back to the database page, select the databases to link to; more than one database can be selected.
Microarray:
MIAME: Minimum Information About a Microarray Experiment
Two-color (two-channel) microarrays are typically hybridized with cDNA prepared from two samples to be compared (e.g. diseased tissue versus healthy tissue), each labeled with a different fluorophore.
In single-channel (one-color) microarrays, the arrays provide intensity data for each probe or probe set, indicating a relative level of hybridization with the labeled target.
In standard microarrays, the probes are synthesized and then attached via surface engineering to a solid surface by a covalent bond to a chemical matrix. Other microarray platforms, such as Illumina, use microscopic beads instead of the large solid support.



  1. In the description file for the TAXONOMY database, the usage of taxonomy entries in
other databases is mentioned. Which types of other databases refer to TAXONOMY
entries?

The taxonomy database of the International Sequence Database Collaboration contains the names of all organisms that are represented in the sequence databases with at least one nucleotide or protein sequence. Sequence databases (such as EMBL, the ENA Project, RefSeq Genome, etc.) therefore refer to TAXONOMY entries.

  1. What is a catalogue?

The database catalog of a database instance consists of metadata in which definitions of database objects such as base tables, views (virtual tables), synonyms, value ranges, indexes, users, and user groups are stored (Wikipedia)
In computing, a catalog is a directory of information about data sets, files, or a database. A catalog usually describes where a data set, file or database entity is located and may also include other information, such as the type of device on which each data set or file is stored.

  1. What is an index? How does SRS “index” over several databases?

A) An index is a feature of an entity that allows identifying and searching for elements of
the entity. Database indexes are auxiliary data structures that allow for quicker retrieval of data at the cost of slower writes and increased storage space. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.
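A minimal sketch of a database index, using Python's built-in sqlite3 module and an in-memory database. The table entry, its columns and the accession values are invented for illustration. The query plan is printed so you can check whether SQLite reports using the index for the lookup.

```python
import sqlite3

# In-memory database with a made-up table of sequence entries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (id INTEGER PRIMARY KEY, organism TEXT, accession TEXT)"
)
conn.executemany(
    "INSERT INTO entry (organism, accession) VALUES (?, ?)",
    [
        ("Homo sapiens", "ACC0001"),
        ("Mus musculus", "ACC0002"),
        ("Homo sapiens", "ACC0003"),
    ],
)

# The index is an auxiliary structure built on one column of the table.
conn.execute("CREATE INDEX idx_entry_organism ON entry (organism)")

# Ask SQLite how it would run a lookup on the indexed column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT accession FROM entry WHERE organism = ?",
    ("Homo sapiens",),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)

# The lookup itself returns the same rows with or without the index.
accessions = sorted(
    row[0]
    for row in conn.execute(
        "SELECT accession FROM entry WHERE organism = ?", ("Homo sapiens",)
    )
)
print(accessions)
```

Every insert now has to update idx_entry_organism as well, which is the "slower writes and increased storage space" trade-off mentioned above.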

B) SRS indexing process
SRS is updated daily. It uses an update mechanism whereby external and local ftp sites are checked for new data files on a daily basis, so the system always provides the most up-to-date data available. The system can index plain text, HTML and XML formatted data files. These data files are broken down by a parser into entries and subsequently into fields. The resulting field indices can then be used for data retrieval or for generating searchable links between different database entries. SRS indexes database records using a word-by-word approach. Queries can be broadened or refined by using any of the logical operators: and, or, and but not.
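The word-by-word approach and the three logical operators can be imitated with a toy inverted index. This is only a sketch of the general idea, not SRS's actual implementation; the entry IDs and texts are invented, and Python set operations stand in for and, or and but not.

```python
import re
from collections import defaultdict

# Invented entries: ID -> text of one field.
entries = {
    "E1": "human insulin precursor protein",
    "E2": "mouse insulin receptor",
    "E3": "human hemoglobin beta chain",
}


def build_word_index(records):
    """Map every word to the set of entry IDs that contain it."""
    index = defaultdict(set)
    for entry_id, text in records.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(entry_id)
    return index


index = build_word_index(entries)

# Logical operators expressed as set operations on the index.
human_and_insulin = index["human"] & index["insulin"]      # and
human_or_insulin = index["human"] | index["insulin"]       # or
insulin_but_not_human = index["insulin"] - index["human"]  # but not

print(sorted(human_and_insulin))
print(sorted(human_or_insulin))
print(sorted(insulin_but_not_human))
```

Because each query result is just a set of entry IDs, the result of one query can be combined with the next one, which is the idea behind refining a search step by step.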

  1. What is a hierarchy? Which relationship-type is used in hierarchies?
A hierarchy is an organization of entities in which each element (except the top one) has
one parent. The elements are linked by parent-child relationships, and every child element has the features of its parent element.

  1. What is a taxonomy? Give a brief definition of a taxonomy!

A taxonomy is a collection of controlled vocabulary terms organized into a
hierarchical structure. Each term in a taxonomy is in one or more parent-child
relationships to other terms in the taxonomy. A taxonomy has only relations of type “is_a”.
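A small sketch of a taxonomy as a tree, assuming the simple case where every term has at most one parent. The dictionary parent_of and the helper names lineage and is_a are invented for this example; the terms are a heavily shortened lineage of Homo sapiens.

```python
# Each term points to its single parent (child "is_a" parent).
parent_of = {
    "Homo sapiens": "Homo",
    "Homo": "Hominidae",
    "Hominidae": "Primates",
    "Primates": "Mammalia",
}


def lineage(term):
    """Return the term followed by all of its ancestors up to the root."""
    path = [term]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


def is_a(term, ancestor):
    """True if 'ancestor' appears somewhere above 'term' in the tree."""
    return ancestor in lineage(term)[1:]


print(lineage("Homo sapiens"))
print(is_a("Homo", "Mammalia"))
print(is_a("Mammalia", "Homo"))
```

Because there is only one relation type and one parent per term, a plain child-to-parent dictionary is enough to store the whole structure.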

  1. What is an ontology? What are the essential features of an ontology that distinguishes
it from a taxonomy?

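An ontology is a controlled vocabulary expressed in an ontology representation language,
which has a grammar for using vocabulary terms to express something meaningful
within a specified domain of interest. Unlike a taxonomy, an ontology is organized as a DAG, so a term can have more than one parent, and it can use relation types other than “is_a” (for example “part_of”).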


  1. Which of the above mentioned controlled vocabularies has a tree structure?

Taxonomy

  1. What is a directed acyclic graph (DAG) and which type of knowledge representation
is based on such a DAG?

A DAG is a type of graph in which every edge has a direction and there are no
cycles: following the edges never leads back to the starting node.
An ontology is based on a DAG.

· mitochondrion has two parents: it is an organelle and it is part of the cytoplasm;
· organelle has two children: mitochondrion is an organelle, and organelle membrane is part of organelle.
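The mitochondrion example above can be written down as a small graph with typed edges. This is a sketch only: the edge list contains just the three relations from the bullets, the helper names is_acyclic and children_of are invented, and the cycle check relies on the standard-library graphlib module (Python 3.9 or newer).

```python
from graphlib import CycleError, TopologicalSorter

# (child, relation, parent) triples taken from the bullets above.
edges = [
    ("mitochondrion", "is_a", "organelle"),
    ("mitochondrion", "part_of", "cytoplasm"),
    ("organelle membrane", "part_of", "organelle"),
]

# parents[term] maps each parent of the term to the relation type.
parents = {}
for child, relation, parent in edges:
    parents.setdefault(child, {})[parent] = relation
    parents.setdefault(parent, {})


def is_acyclic(graph):
    """True if the node -> predecessors mapping contains no cycle."""
    try:
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


def children_of(term):
    """All terms that list 'term' as one of their parents."""
    return sorted(child for child, ps in parents.items() if term in ps)


graph = {term: set(ps) for term, ps in parents.items()}
print(parents["mitochondrion"])
print(children_of("organelle"))
print(is_acyclic(graph))
```

A term with two parents, as mitochondrion has here, cannot be stored in a simple tree, which is why a DAG is needed.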

  1. Please explain / characterize the content of PubMed: what does a typical minimum
data set look like in PubMed? I refer to the “anatomy of search results page”
mentioned in the PubMed documentation.

PubMed search results are displayed in a summary format. Each summary result has the following anatomy:
1. Title
2. Abbreviated names of authors
3. Abbreviated journal title
4. Publication date
5. Volume, issue and page numbers of the article


  1. What is the difference between PubMed and MEDLINE? Explain in brief!

MEDLINE is a bibliographic database containing citations and abstracts of bioscience
articles. PubMed is a service of the NCBI Entrez search and retrieval system. PubMed
provides access to bibliographic information that includes MEDLINE and some other
resources (PubMed Central, articles from journals before their inclusion in MEDLINE, and
out-of-scope articles). It also provides links to free full-text articles (if available).

  1. What is PubMedCentral? What does it contain and how does it differ from PubMed?

PubMed Central is a free digital archive of full-text biomedical and life sciences journal literature at the U.S. National Institutes of Health (NIH), developed and managed by NIH's National Center for Biotechnology Information (NCBI) in the National Library of Medicine (NLM). It differs from PubMed in that it holds the full text of articles, whereas PubMed provides bibliographic information (citations and abstracts) and links to full text where available.
