Keywords (18):
Automatic Annotation
Conditional Random Field
Data Access
Data Representation
Design and Implementation
Development Tool
Environmental Chemistry
Feature Selection
Indexation
Information Extraction
Information Gathering
Machine Learning
Mutual Information
Open Source
Search Engine
Sequence Mining
Support Vector Machine
Web Search Engine
Chem X Seer: A Web Search Engine and Repository for e-Chemistry
C. Lee Giles, Prasenjit Mitra, Karl Mueller, James Z. Wang, Bingjun Sun, Levent Bolelli, Ying Liu, Isaac Councill, William Brower, Qingzhao Tan, Anuj Jaiswal, James Kubicki
Cyberinfrastructure, or e-science, has become crucial for scientific progress, and open source systems have greatly facilitated its design and implementation. In chemistry, the growth of data has been explosive, and timely and effective access to information and data is critical. We discuss the architecture of Chem X Seer (funded by NSF Chemistry), a portal and search engine for academic researchers in environmental chemistry that integrates the scientific literature with experimental, analytical and simulation datasets. Chem X Seer consists of information crawled from the web, manually submitted scientific documents and user-submitted datasets, as well as scientific documents and metadata provided by major publishers. Information gathered from the web is publicly accessible, whereas access to restricted publisher resources will be provided by linking to the publishers' respective sites; users can control access to their own data. Thus, instead of being a fully open search engine and repository, Chem X Seer will be a hybrid one, limiting access to some resources.

Chem X Seer offers some unique aspects of search not yet present in other scientific search services or search engines. We have developed or are developing algorithms for the extraction of tables, figures, and chemical names and formulae from scientific documents, enabling users to search on those fields. In particular, Chem X Seer will provide the following search features:
• Full text search
• Author, affiliation, title and venue search
• Table search
• Figure search
• Chemical formulae and name search
• Citation and acknowledgement search
• Citation linking and statistics

Chem X Seer takes advantage of many open source search and indexing tools such as Lucene and CiteSeer. For dataset search, we are developing tools that automatically annotate published data representations such as figures, and that permit researchers to annotate their datasets by providing both document-level and attribute-level metadata in OAI-PMH format. This level of data annotation permits more effective data search at both the attribute and semantic levels, and allows browsing of datasets and linking to existing scientific literature and other datasets in our and other repositories. Because Chem X Seer requires unique information extraction, several different machine learning methods, such as conditional random fields, support vector machines, mutual information based feature selection, and sequence mining, are critical for performance. We give a progress report on Chem X Seer and draw lessons for other e-science and cyberinfrastructure systems in terms of design, implementation and research.