LIBEXTRACTOR(3) DragonFly Library Functions Manual LIBEXTRACTOR(3)
NAME
libextractor - meta-information extraction library 1.0.0
SYNOPSIS
#include <extractor.h>
const char *EXTRACTOR_metatype_to_string (enum EXTRACTOR_MetaType
type);
const char *EXTRACTOR_metatype_to_description (enum EXTRACTOR_MetaType
type);
enum EXTRACTOR_MetaType EXTRACTOR_metatype_get_max (void);
struct EXTRACTOR_PluginList *EXTRACTOR_plugin_add_defaults (enum
EXTRACTOR_Options flags);
struct EXTRACTOR_PluginList *EXTRACTOR_plugin_add (struct
EXTRACTOR_PluginList *prev, const char *library, const char *options,
enum EXTRACTOR_Options flags);
struct EXTRACTOR_PluginList *EXTRACTOR_plugin_add_last (struct
EXTRACTOR_PluginList *prev, const char *library, const char *options,
enum EXTRACTOR_Options flags);
struct EXTRACTOR_PluginList *EXTRACTOR_plugin_add_config (struct
EXTRACTOR_PluginList *prev, const char *config, enum EXTRACTOR_Options
flags);
struct EXTRACTOR_PluginList *EXTRACTOR_plugin_remove (struct
EXTRACTOR_PluginList *prev, const char *library);
void EXTRACTOR_plugin_remove_all (struct EXTRACTOR_PluginList
*plugins);
void EXTRACTOR_extract (struct EXTRACTOR_PluginList *plugins, const
char *filename, const void *data, size_t size,
EXTRACTOR_MetaDataProcessor proc, void *proc_cls);
int EXTRACTOR_meta_data_print (void *handle, const char *plugin_name,
enum EXTRACTOR_MetaType type, enum EXTRACTOR_MetaFormat format, const
char *data_mime_type, const char *data, size_t data_len);
EXTRACTOR_VERSION
DESCRIPTION
GNU libextractor is a simple library for keyword extraction.
libextractor does not support all formats, but it provides a simple
plugin mechanism so that you can quickly add extractors for
additional formats, even without recompiling libextractor.
libextractor typically ships with dozens of plugins that can be used to
obtain meta data from common file-types. If you want to write your own
plugin for some filetype, all you need to do is write a little library
that implements a single method with this signature:
void EXTRACTOR_XXX_extract_method (struct EXTRACTOR_ExtractContext
*ec);
ec contains function pointers for reading, seeking, getting the overall
file size and returning meta data. There is also a field with options
for the plugin. New plugins will be automatically located and used
once they are installed in the respective directory (typically
something like /usr/lib/libextractor/).
The application extract(1) gives an example of how to use libextractor.
The basic use of libextractor is to load the plugins (for example with
EXTRACTOR_plugin_add_defaults), then to extract the keyword list using
EXTRACTOR_extract, and finally to unload the plugins (with
EXTRACTOR_plugin_remove_all).
Textual meta data obtained from libextractor is supposed to be UTF-8
encoded if the text encoding is known. Plugins are supposed to convert
meta-data to UTF-8 if necessary. The EXTRACTOR_meta_data_print
function converts the UTF-8 keywords to the character set from the
current locale before printing them.
SEE ALSO
extract(1)
LEGAL NOTICE
libextractor is released under the GPL and is a GNU package
(http://www.gnu.org/).
BUGS
A couple of file-formats (on the order of 10^3) are not recognized...
AUTHORS
extract was originally written by Christian Grothoff
<christian@grothoff.org> and Vidyut Samanta <vids@cs.ucla.edu>. Use
<libextractor@gnu.org> to contact the current maintainer(s).
AVAILABILITY
You can obtain the original author's latest version from
http://www.gnu.org/software/libextractor/.
GNU libextractor 1.0.0 Sept 4, 2012 LIBEXTRACTOR(3)