Overview of challenges and solutions in the Primo implementation at the OBV


(1)

DIE ÖSTERREICHISCHE BIBLIOTHEKENVERBUND UND SERVICE GMBH

Overview of challenges and solutions in the Primo implementation at the OBV

Victor Babitchev, Ulrike Krabo

Primo Entwicklertreffen, Dresden 08.11.2011

(2)

Contents

Primo at the OBV

Data preparation of central resources in the heterogeneous OBV environment

Data corrections and data checks

Overview of other tools

Frontend extensions

• Layout changes

• PNX enrichment

• Dynamic integration of additional services (Wikipedia, Google Books)

OBVSG: Primo Entwicklertreffen, Dresden 08.11.2011 2

(3)

Primo at the OBV: 8 participants (+4)

Institution                                Start
Universität Innsbruck                      autumn 2009
Universität Wien                           March 2010
Verbundsicht (consortium view)             April 2010
Veterinärmedizinische Universität Wien     July 2010
Universität Graz                           February 2011
Österreichische Nationalbibliothek         May 2011
Wirtschaftsuniversität Wien                July 2011
Universität für angewandte Kunst           November 2011
Technische Universität Wien                Q1 2012 (?)
Universität Klagenfurt                     Q1-Q2 2012 (?)
Medizinische Universität Wien              Q3 2012 (?)
Universität Salzburg                       Q4 2012 (?)


(4)

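Central Primo instance: data model / data flow

[Diagram. Labels: local systems (Lokalsystem; UBW, ACC, UBI, ...) with BIB01/BIB02, HOL, SE, ADM and Z30, each with its own local publishing (Publish Lokal 1, Publish Lokal 2); consortium system (Verbundsystem) ACC01 with BIB, HOL and Z300 and central publishing (Publish zentral); eDOC with electronic objects linked to ACC01; preparation (PPS); a step that moves local BIB fields to HOL; exports for Primo. Inside Primo: harvesting, normalization, enrichment with Plug-in-1 (enrich.) and Plug-in-2 (index.), loading, dedup and FRBR, resulting in PNX. Further sources shown: ML KB, SFX, digital repositories.]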


(5)

Data preparation of central resources in the heterogeneous OBV environment


(6)

Challenges of heterogeneous data supply for Primo

Primo is flexible in processing and indexing data in heterogeneous environments

But it is your job to prepare the data and, what is even more challenging, to keep the data consistent!


(7)

Challenges of heterogeneous data supply for Primo

A typical heterogeneous environment: catalog and “catalog enrichment repository”

• The catalog is the master system for bibliographic data (8.3 million records); part of it has linked objects that can be indexed (abstracts, TOCs, full texts)

• The repository is the master system for digital objects (our “eDOC” maintains ~575,000 items)

• Each system has its own data management workflow

[Diagram: relation between the two systems. Catalog record (0..1) “record has dig. object” Repository object (1..N); Repository object (1..N) “objects linked to” Catalog record (1).]


(8)

Challenges of heterogeneous data supply for Primo

To implement full-text indexing in Primo, data from both sources must be supplied and maintained consistently!

Processing of bibliographic and linked “full text” data in Primo (top to bottom flow)

Note: OBVSG uses the Primo Back Office “import pnx_extensions” tool; an alternative would be to use Primo file splitters (run in pipes).

[Diagram: Catalog record -> Primo pipe -> pnx_record. Repository object -> Import pnx_extensions -> pnx_extension (TOC) and pnx_extension (“full text”). Both feed indexing of the full document: bibl. data + full text.]
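The flow on this slide can be sketched as follows. This is only an illustration, not Primo's indexing code: the classes `PnxRecord` and `PnxExtension`, their fields and the function `build_index_document` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class PnxRecord:
    """Hypothetical stand-in for a pnx_record produced by a Primo pipe."""
    record_id: str
    bib_text: str      # searchable bibliographic data


@dataclass
class PnxExtension:
    """Hypothetical stand-in for one imported pnx_extension."""
    record_id: str     # ID of the pnx_record the extension belongs to
    kind: str          # e.g. "toc" or "fulltext" (labels assumed here)
    text: str          # plain text delivered by the repository


def build_index_document(record, extensions):
    """Combine bibliographic data with every extension linked to the record."""
    parts = [record.bib_text]
    parts += [ext.text for ext in extensions if ext.record_id == record.record_id]
    return {"id": record.record_id, "content": "\n".join(parts)}
```

The point of the sketch is that the index document only stays correct if both inputs are kept in step: an extension whose record ID no longer matches any record is silently ignored.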


(9)

Challenges of heterogeneous data supply for Primo

The challenge

• Catalog and repository are master systems for the data types which they manage

• Primo is a „slave“ system consuming, transforming and linking data that it receives from the master systems

• Changes in each master system must be registered consistently and prepared for Primo with their interrelationships taken into account. The latter is not trivial to achieve.


(10)

Challenges of heterogeneous data supply for Primo

OBVSG approach – consistent data extraction for Primo

• Data changes in the catalog are prepared by the Ex Libris “Aleph Publishing Mechanism”

• Object changes in eDOC are prepared by PPS, which also retrieves from eDOC the plain texts of the objects (*.pdf etc.)

-> PPS prepares the input for Primo, “merging” the data with its relations taken into account

[Diagram: time scale from Day 1 to Day 2 (t1 to t4). The Central Catalog supplies a delta as MAB XML and eDOC supplies a delta as eDOC XML (plain texts) to PPS (Primo Proc. System). PPS outputs pnx data for the Primo pipe and pnx_ext data for “Import pnx_extensions”, followed by Primo indexing.]
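One way to picture the “merging” of the two deltas is the sketch below. It is not the PPS implementation, which the slides do not show. The function name `plan_primo_update` and the rule that a change on either side pulls in the linked item on the other side are assumptions made for illustration.

```python
def plan_primo_update(changed_records, changed_objects, object_to_record):
    """Return (record IDs for the Primo pipe, object IDs for the extension import).

    changed_records:  IDs from the catalog delta
    changed_objects:  IDs from the eDOC delta
    object_to_record: mapping of each eDOC object ID to its catalog record ID
    """
    records = set(changed_records)
    objects = set(changed_objects)
    for obj, rec in object_to_record.items():
        if obj in changed_objects:
            records.add(rec)     # assumed: an object change re-supplies its record
        if rec in changed_records:
            objects.add(obj)     # assumed: a record change re-supplies its objects
    return sorted(records), sorted(objects)
```

For example, with a catalog delta `{"R1"}`, an eDOC delta `{"O9"}` and links `O1 -> R1`, `O9 -> R2`, the plan contains records `R1` and `R2` and objects `O1` and `O9`.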


(11)

Data corrections and data checks


(12)

Consistent data changes in a heterogeneous environment

Sample problem

A librarian receives a request to remove access to a full-text dissertation

• the text object is a link in the catalog record. Removing the link (tag 655) is not a solution, but it happens …

The correct solution requires considering the role of each system component (“master” or “slave”) and its interconnections

… but it is hard to process such requests manually -> we decided to automate it:

• protected eDOC 655 tags against manual changes (indicator “o”)

• implemented a “request language”; librarians place requests into the catalog's “memo record”


(13)

Implementation of consistent data changes

“Command requests” vs. direct changes in 655 links to eDOC

• Requests are placed into standard “memo records” from the Aleph GUI client, e.g. “delete this object” … or “replace that scan with a better one”

• “Transaction”: a Z104 request activates controlled changes in all three systems (ACC01, eDOC, Primo)

[Diagram: processing chain.]

1. acc01.Z104 (request)
2. Queue manager (queue)
3. eDOC module
4. ACC01 module
5. Primo module

Perl, Oracle, MySQL. Resources: 4 m/month
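The chain of queue manager and three modules can be sketched roughly as below. The actual tool is written in Perl and its request syntax is not given on the slides, so the `<COMMAND> <object-id>` format, the command names and the functions `parse_request` and `process_queue` are hypothetical.

```python
from collections import deque

KNOWN_COMMANDS = {"DELETE", "REPLACE"}   # assumed command set


def parse_request(memo_text):
    """Parse a hypothetical request of the form '<COMMAND> <object-id>'."""
    command, _, object_id = memo_text.strip().partition(" ")
    command = command.upper()
    object_id = object_id.strip()
    if command not in KNOWN_COMMANDS or not object_id:
        raise ValueError(f"unrecognized request: {memo_text!r}")
    return command, object_id


def process_queue(queue, modules):
    """Apply every queued request to all modules in order and log each step.

    modules is a list of (name, handler) pairs; each handler takes
    (command, object_id).
    """
    log = []
    while queue:
        command, object_id = parse_request(queue.popleft())
        for name, handler in modules:
            handler(command, object_id)
            log.append((name, command, object_id))
    return log
```

Running every request through the same ordered list of modules is what makes one memo-record entry behave like a single transaction across the systems. A real implementation would also need error handling and rollback, which this sketch leaves out.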


(14)

Implementation of consistent data changes

Have all problems been solved so far?

Almost… Good control over “the situation” is still needed:

-> check and validate your data regularly

-> prepare, or better “generate”, corrections where possible

To achieve this, we developed another tool: “eDocXray4Primo”


(15)

Validation of data consistency

[Diagram: eDocXray4Primo takes (1) Bib. IDs of bibl. records having links to eDOC, checks (2) Aleph, (3) eDOC and (4) Primo, and produces reports.]

eDocXray4Primo checks data related to full-text indexing in all three systems:

• eDOC

• Aleph

• Primo

Error codes:

er-1  : NO SysNr was found for given ACNrs
er-3  : XREF exists but NO edoc objects
er-4  : eDOC object(s) exist but NO XREF found
er-21 : NO objects in eDOC but pnx-ext. and XREF exist
er-22 : NO pnx-ext. found for eDOC object(s) and XREF exists
er-23 : Nr. of eDOC objects greater than nr. of pnx-ext. (XREF exists)
er-24 : Nr. of eDOC objects less than nr. of pnx-ext. (XREF exists)
er-25 : ACC DS: NO PNX record exists in Primo (for ACC DS only)
er-26 : non-ACC: NO PNX record exists in Primo (XREF for ACC DS exists!)
er-27 : non-ACC: NO PNX record exists in Primo (XREF for ACC DS NOT exists!)
er-28 : NO PNX record exists due to failure at harvesting/nep
wrn-2 : Nr. of eDOC objects not equal to nr. of V_enrichm. records

Sample report

Bib.ID     | SYSNR     | eDocObjs | PNXs | XREFs | EXTs | V-ENRs | Status
AC07662934 : 007456651 : 0 : 1 : 1 : 1 : 1 : er-3
AC07055340 : 007105200 : 1 : 1 : 0 : 0 : 1 : er-4
AC06490014 : 006352144 : 1 : 1 : 0 : 0 : 0 : er-4

Perl, Oracle, MySQL / ~ 2 m/month
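How such counts could map to status codes is sketched below for a subset of the codes. This is not the eDocXray4Primo logic: the order in which the checks are tried, the `"ok"` status and all names are assumptions. The order was chosen only so that the three sample rows above come out with the status shown in the report. With this order er-21 would never be reached, so it is omitted.

```python
from typing import NamedTuple


class Counts(NamedTuple):
    edoc_objects: int
    pnx_records: int
    xrefs: int
    pnx_extensions: int
    v_enrichments: int


def parse_report_line(line):
    """Split a report row whose fields are separated by ':' as in the sample."""
    fields = [f.strip() for f in line.split(":")]
    counts = Counts(*(int(f) for f in fields[2:7]))
    return fields[0], fields[1], counts, fields[7]


def classify(c):
    """Return a status for a subset of the codes (check order is assumed)."""
    if c.xrefs > 0 and c.edoc_objects == 0:
        return "er-3"
    if c.edoc_objects > 0 and c.xrefs == 0:
        return "er-4"
    if c.xrefs > 0 and c.pnx_extensions == 0:
        return "er-22"
    if c.xrefs > 0 and c.edoc_objects > c.pnx_extensions:
        return "er-23"
    if c.xrefs > 0 and c.edoc_objects < c.pnx_extensions:
        return "er-24"
    if c.edoc_objects != c.v_enrichments:
        return "wrn-2"
    return "ok"   # hypothetical label for "no finding"
```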


(16)

Running data supply and validation tools

All three tools run daily

• Object data corrections (“655 requests”) run before PPS

• PPS starts at 22:00, followed by the Primo pipes (10,000 to 40,000 records daily)

• Data validation and corrections run after the Primo update


(17)

Primo data supply summary

• If complex or intensive processing of data for Primo is needed, you can do it on the [MAB] XML files produced by the Aleph Publishing Mechanism and then pass the result to the Primo pipes

• Implement a stable data supply from heterogeneous sources and plan the implementation of consistent data changes along with regular validation -> this saves hours of analysis of complex data problems!

• Management of full texts in Primo does not yet support some repository operations (e.g. we delete objects with our own tools)

• We avoid pipes for bringing full texts into Primo: we import objects outside the pipes (this scales well and brings other advantages for us)


(18)

Other Primo Tools

Examples

• Statistics tools for institutions

• Back office tools (“attempts to compensate” for its lack of multi-tenancy): backup of configuration data, reports on changes, finding differences between production and staging systems

• Single Sign-On (Shibboleth)

• Other consortium-specific tools

Note: SQL access to the Primo Oracle database is often used in our tools


(19)

OBV frontend extensions

(20)

Overview

(21)

Examples: layout changes + goodies

Separation of document-type icons and cover images + links in the brief display

Navigation in the brief display:

Permalink: e.g. http://permalink.obvsg.at/AC07023679

Individual help pages for each view (HTML and CSS):

No tabs for FRBRized hits:


(22)

Examples: PNX enrichments

(23)

Examples: dynamic integration of new services

New tabs (Wikipedia, Google Books, Ebooks On Demand):

Addthis.com

Tooltips


(24)

Technical challenges

(25)

Concept: dynamic loading of additional data

[Diagram: Primo UI (brief, full, eshelf) calls the OBV web service, which uses the Primo DB, a cache and external sources (RVK Online API, Google Books API, DBPedia).]

1: HTTP request (JS AJAX onReady)

2: Web service (FastCGI) -> read PNX from DB -> cache -> HTTP request to external APIs

3: Evaluate the JSON response and integrate the services -> jQuery -> Tab API
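Step 2 can be sketched as a small server-side function with a time-limited cache. This is an illustration under assumptions, not the OBV web service: `TimedCache`, `handle_request` and the callables `read_pnx` and `external_lookup` are hypothetical names, and the shape of the JSON answer is invented for the example.

```python
import json
import time


class TimedCache:
    """Keep fetched values for at most max_age_seconds."""

    def __init__(self, max_age_seconds, clock=time.time):
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}   # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]                  # still fresh: skip the external call
        value = fetch(key)
        self._entries[key] = (now, value)
        return value


def handle_request(record_id, read_pnx, external_lookup, cache):
    """Read the PNX, ask external services (via the cache), return JSON text."""
    pnx = read_pnx(record_id)
    services = cache.get_or_fetch(record_id, lambda _key: external_lookup(pnx))
    return json.dumps({"id": record_id, "services": services})
```

Passing the clock in as a parameter keeps the cache testable without waiting for entries to expire.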


(26)

Implementation details

showPnx=true is too slow

• -> server-side component required (X-Service or DB request)

External services

• Different APIs

• Google Books API, RVK Online API, …

• SPARQL (Linked Data) would be the ideal solution, but there are technical problems (availability and performance) -> hence a workaround:

• Download the data regularly -> work with local data!

• Cache the results of queries to the external APIs (currently weekly)

FastCGI

• Runs on a different machine -> cross-domain requests -> hence JSONP
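JSONP works by wrapping the JSON answer in a call to a function whose name the browser sends along with the request. A minimal sketch of the server side follows. The function name `jsonp_response` and the callback validation rule are assumptions for this example and are not taken from the OBV service.

```python
import json
import re

# Accept only simple identifiers (optionally dotted) as callback names, so the
# caller cannot inject arbitrary script into the response.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$.]*")


def jsonp_response(callback, payload):
    """Wrap payload as JSONP: '<callback>(<json>);'."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid JSONP callback name")
    return f"{callback}({json.dumps(payload)});"
```

On the client, jQuery generates the callback name itself when a request is made with the `jsonp` data type, so the server only has to echo it back around the data.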


(27)

Wikipedia tab

Prerequisites:

• PND in the PNX record (e.g. display/lds34)

• Short abstracts from DBPedia: http://dbpedia.org/Downloads

• PNDs used in Wikipedia: http://toolserver.org/~apper/pnd.txt

• URL redirect: http://toolserver.org/~apper/pd/person/pnd-redirect/de/[PND]

Workflow: If the PNX contains a PND and that PND is used in Wikipedia, the Wikipedia tab is created

Possible improvements:

• Use the DBPedia SPARQL endpoint

• More information in the tab, for example images
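The workflow reduces to a set lookup against locally held data. The sketch below assumes that the downloaded PND list contains one PND per line, which the slides do not state. `load_used_pnds` and `wikipedia_tab_url` are hypothetical names. Only the redirect URL pattern is taken from the slide.

```python
REDIRECT_URL = "http://toolserver.org/~apper/pd/person/pnd-redirect/de/{pnd}"


def load_used_pnds(lines):
    """Build a set of PNDs from text lines (one PND per line is assumed)."""
    return {line.strip() for line in lines if line.strip()}


def wikipedia_tab_url(pnx_pnds, used_pnds):
    """Return the redirect URL for the first PND used in Wikipedia, else None."""
    for pnd in pnx_pnds:
        if pnd in used_pnds:
            return REDIRECT_URL.format(pnd=pnd)
    return None   # no tab is created
```

Because the lookup runs against a local copy of the list, it does not depend on the availability of an external endpoint at request time.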


(28)

Book preview tab

Prerequisites:

• ISBN in the PNX record (e.g. display/identifier)

• Google Books API

Workflow: If the PNX contains at least one ISBN, the Google Books API is queried by ISBN. If Google Books offers a partial or full view for this ISBN, the Google Books tab is generated.

Possible improvements:

• Preview directly in a lightbox in Primo
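The decision in this workflow can be sketched against an already parsed API answer. The slides do not say which version of the Google Books API was used. The sketch assumes the response shape of the Books API v1 `volumes` list, where each item carries `accessInfo.viewability`, and the function names are hypothetical.

```python
def books_query_url(isbn):
    """Build a Books API v1 search URL for one ISBN (API version is assumed)."""
    return "https://www.googleapis.com/books/v1/volumes?q=isbn:" + isbn


def has_preview(volumes_response):
    """True if any returned volume offers a partial or full view."""
    for item in volumes_response.get("items", []):
        viewability = item.get("accessInfo", {}).get("viewability")
        if viewability in {"PARTIAL", "ALL_PAGES"}:
            return True
    return False
```

The tab would then be generated only when `has_preview` is true for at least one of the record's ISBNs.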


(29)

Overview of plugins used

(+ Snippets) + addthis.com + Tab API


(30)

Summary: frontend extensions

JSP changes are difficult

• Consortium environment and maintenance

Limits of JS

• Performance and possible conflicts with the Ex Libris JS

• Old jQuery version

• Maintenance

PNX enrichment vs. dynamic extensions with JS

Working with plugins

Ideas (but no resources)

• RTA for the central view

• More PushTo formats -> collaboration within the Primo community?


(31)

Thank you for your attention

Victor.Babitchev@obvsg.at Ulrike.Krabo@obvsg.at

