A bit of monitoring in information and documentation science
| by Fabrizio Tinti |








Wednesday, 15 July 2009

D-Lib Magazine (August 09)

Highlights from the table of contents of the latest issue of D-Lib Magazine (vol. 15, no. 7-8, July-August 09):

Articles:

"This article will discuss how to measure the accuracy of Optical Character Recognition (OCR) output in a way that is relevant to the needs of the end users of digital resources. A case study measuring the OCR accuracy of the British Library's 19th Century Newspapers Database provides a clear example of the benefits to be gained from measuring not just character accuracy but also word and significant word accuracy. As OCR primarily facilitates searching, indexing and other means of structuring the user experience of online newspaper archives, measuring the word and significant word accuracy of the OCR output is very revealing of a resource's likely performance for these functions. Having such data is therefore extremely helpful for planning and quality assurance assessment. After briefly discussing the role of OCR in the text capture process and how OCR works, we give a detailed description of the methodology, statistical data gathering techniques and analysis used in this study. Our conclusions point the way forward with suggested actions to assist other mass digitization projects in applying these techniques."
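To make the distinction between character, word and significant word accuracy concrete, here is a minimal Python sketch. It is not the methodology of the article: the abstract does not give the formulas, so the definitions below are assumptions. Accuracy is taken as 1 minus the edit distance divided by the reference length, and a "significant word" is taken to be any word outside a small, hypothetical stop-word list (`STOP_WORDS`). All function names are illustrative.

```python
from typing import Sequence

# Hypothetical stop-word list. The abstract does not say how the study
# defines a "significant word"; this is an assumption for illustration.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(ref: Sequence, hyp: Sequence) -> float:
    """Assumed definition: 1 - edit distance / reference length, floored at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def significant(words: Sequence[str]) -> list:
    """Keep only the words that are not in the stop-word list."""
    return [w for w in words if w.lower() not in STOP_WORDS]


def ocr_accuracy(reference: str, ocr_output: str) -> dict:
    """Compare OCR output with a hand-corrected reference at three levels."""
    ref_words = reference.split()
    hyp_words = ocr_output.split()
    return {
        "character": accuracy(reference, ocr_output),
        "word": accuracy(ref_words, hyp_words),
        "significant_word": accuracy(significant(ref_words),
                                     significant(hyp_words)),
    }


if __name__ == "__main__":
    # "London" misread as "Lonclon": two character errors, one word error.
    print(ocr_accuracy("the fire in London", "the fire in Lonclon"))
```

On this toy example, character accuracy is about 0.89, word accuracy is 0.75, and significant word accuracy drops to 0.5, since the one misread word is also one of only two searchable terms. This is the kind of gap the abstract points to when it says word-level measures are more revealing for search and indexing.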

"This article is motivated by the demand for unified access to the wealth of distributed digital cultural collections, allowing users to make queries and discover information about them through integrated processes. Our effort originates from the semantic interoperability perspective and considers CIDOC/CRM as the mediating schema, which integrates in an optimal way the semantics of the collection-level metadata schemas and application profiles. The research reveals the complexity of mapping metadata schemas to ontologies and resolves particular difficulties by presenting the crosswalk between Dublin Core Collections Application Profile and CIDOC/CRM."

Conference reports:

  • Doing So Much More: The Fourth Annual International Conference on Open Repositories (OR09)
