innovation in metadata design, implementation & best practices
Corpus is a search engine written in Java that indexes the content, context and metadata of documents on the network or on the local file system. It is used as a "distributed relational database" for document-oriented solutions, with metadata serving as indexed keys.
It also features:
- Rich result overviews, structured by document context or by clustering
- XML and RSS result output from a built-in HTTP server, easy to integrate with XSL transforms
- Parsers for multiple formats including HTML, XML formats, PDF, DOC, etc.
- Agent that auto-generates metadata using LDAP and URL mappings
- Direct indexing using the WebDAV log or file system auditing events
- Web and command line administration
- APIs for new agents, spiders and parsers
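
As an illustration of the XML output and XSL integration listed above, the following is a minimal sketch that applies an XSL stylesheet to an XML result document using the standard Java javax.xml.transform API. The result format shown here (the results, hit, title and meta elements), the stylesheet and the class name ResultTransformSketch are assumptions made for the example; they are not Corpus's actual output schema or API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ResultTransformSketch {

    // Hypothetical search result; the real Corpus XML schema may differ.
    static final String SAMPLE_XML =
        "<results>"
      + "<hit><title>report.pdf</title><meta name=\"author\">jonas</meta></hit>"
      + "</results>";

    // Stylesheet that renders each hit as a list item with its author metadata.
    static final String SAMPLE_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\" indent=\"no\"/>"
      + "<xsl:template match=\"/results\">"
      + "<ul><xsl:for-each select=\"hit\">"
      + "<li><xsl:value-of select=\"title\"/> (author: "
      + "<xsl:value-of select=\"meta[@name='author']\"/>)</li>"
      + "</xsl:for-each></ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML text and returns the transformed text.
    static String transform(String xml, String xsl) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSL));
    }
}
```

In a real integration the XML would presumably be fetched from the built-in HTTP server rather than held in a string constant; a StreamSource can also be constructed from an InputStream or a URL string for that purpose.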
Monday, February 11, 2002
Jonas Bosson, IllumiNet AB, Stockholm, Sweden
+46-(0)8 666 96 61 (CET)