Fedora batch tool
Beta1: 12/21/2002
Introduction
BatchTool is a command-line Java application for ingesting multiple Fedora objects. Its simplest operation is to ingest objects created by one-up edit or by custom scripting. This operation is described first and is illustrated by the first demo in the distribution.

BatchTool also supports creating objects from a batch template and object-specific substitutions (given in XML "object-specs"). That operation is described second and is illustrated by the second demo in the distribution. (The two demos share common object contents and image data. There are two demos to illustrate the different entry levels of BatchTool use.)

Additional functionality is planned for beta2 to process multiple metadata and media files from a file directory tree structure, thereby creating the input files for the merge phase.

This phased processing pipeline is shown in the following diagram:
(tree representation) -> (PROCESS TREE) -> object-specs
template + object-specs -> MERGE -> fedora mets files
fedora mets files -> INGEST -> fedora repository -> PID assignment

Phases shown in parentheses: processing to be added in beta2.
demo 1: ingest
demo 2: merge
demo 2: ingest
demo 2: merge+ingest
Windows operation is supported through the provided fedora-batch.bat file. A Unix sh script will be provided in a subsequent beta. BatchTool may later be front-ended by the general management/access client or another GUI tool.
ingest
Create a set of Fedora objects in your repository from a corresponding set of METS XML files.
how to run
set FEDORA_HOME=c:\mellon\dist\server
c:
cd \mellon\dist\batch\bin
fedora-batch c:\mellon\dist\batch\beta1-demo1\ingest.properties
The single argument is the path to a properties file. For the first demo of beta1 it has the following values:

ingest=yes
#do only this phase of BatchTool
objects=C:\\mellon\\dist\\batch\\beta1-demo1\\objects
#directory holding (all and only) Fedora METS files to ingest
ingested-pids=C:\\mellon\\dist\\batch\\beta1-demo1\\pids
#path to output file to record pids assigned to the files ingested

The properties table below describes these properties and lists others you may want to use.
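The doubled backslashes in the values above are consistent with Java .properties escaping, where a single backslash starts an escape sequence. The source does not say how BatchTool reads the file, so treat that as an assumption. The following Python sketch parses a simplified subset of the format (key=value lines, # comment lines, and the \\ escape only) and reports which properties required for the ingest phase are missing. The names parse_properties and missing_for_ingest are hypothetical helpers, not part of BatchTool.

```python
# Properties the table marks as required for the ingest phase.
REQUIRED_FOR_INGEST = ("ingest", "objects", "ingested-pids")


def parse_properties(text):
    """Parse a simplified subset of the Java .properties format."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        # Assumption: "\\" in the file stands for one literal backslash.
        props[key.strip()] = value.strip().replace("\\\\", "\\")
    return props


def missing_for_ingest(props):
    """Return the required ingest properties that are absent."""
    return [name for name in REQUIRED_FOR_INGEST if name not in props]


demo1 = r"""
ingest=yes
#do only this phase of BatchTool
objects=C:\\mellon\\dist\\batch\\beta1-demo1\\objects
ingested-pids=C:\\mellon\\dist\\batch\\beta1-demo1\\pids
"""

props = parse_properties(demo1)
print(props["objects"])           # C:\mellon\dist\batch\beta1-demo1\objects
print(missing_for_ingest(props))  # []
```

A check like this can be run before fedora-batch to catch a missing property early.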
output: PIDs of ingested objects
The ingested-pids file has one of the following formats, depending on the value of the pids-format property. (This property defaults to xml.)

pids-format=xml
<map-inputids-to-pids>
      <map inputid="4868090" pid="test:102429" />
      . . .
      <map inputid="4868099" pid="test:107428" />
</map-inputids-to-pids>

pids-format=text (field separator is tab):
4868090	test:102429
. . .
4868099	test:107428
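Either output format can be read back into a mapping from input ID to assigned PID. The sketch below is a minimal Python illustration using only the element and attribute names shown above. The function names parse_pids_xml and parse_pids_text are hypothetical, and the sample data repeats the two entries from this section.

```python
import xml.etree.ElementTree as ET


def parse_pids_xml(text):
    """Read the pids-format=xml output into {inputid: pid}."""
    root = ET.fromstring(text)
    return {m.get("inputid"): m.get("pid") for m in root.iter("map")}


def parse_pids_text(text):
    """Read the pids-format=text output (tab-separated) into {inputid: pid}."""
    mapping = {}
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        inputid, pid = line.split("\t", 1)
        mapping[inputid.strip()] = pid.strip()
    return mapping


xml_sample = """<map-inputids-to-pids>
  <map inputid="4868090" pid="test:102429" />
  <map inputid="4868099" pid="test:107428" />
</map-inputids-to-pids>"""

text_sample = "4868090\ttest:102429\n4868099\ttest:107428\n"

pids = parse_pids_xml(xml_sample)
print(pids["4868090"])                       # test:102429
print(pids == parse_pids_text(text_sample))  # True
```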
merge / ingest
First build a set of METS XML files from a common METS template and simple (non-METS) XML object-specs. Then create a set of Fedora objects in your repository from that set of METS XML files.
how to run
set FEDORA_HOME=c:\mellon\dist\server
c:
cd \mellon\dist\batch\bin
fedora-batch c:\mellon\dist\batch\beta1-demo2\merge-ingest.properties

The properties and values differ for the second demo operation, which merges objects and then ingests them into the Fedora repository:
merge-objects=yes
#first do this phase of BatchTool
ingest=yes
#then do this phase of BatchTool
specifics=C:\\mellon\\dist\\batch\\beta1-demo2\\object-specifics
#directory holding (all and only) object-specs whose values individually
#merge into a template to create the objects to ingest
template=C:\\mellon\\dist\\batch\\beta1-demo2\\beta1-demo2.xml
#Fedora METS giving the common structure of each object of the batch
objects=C:\\mellon\\dist\\batch\\beta1-demo2\\objects
#directory holding (all and only) Fedora METS files to ingest
#these are created by the merge-objects phase
#and are ingested by the ingest phase
ingested-pids=C:\\mellon\\dist\\batch\\beta1-demo2\\pids
#path to output file to record pids assigned to the files ingested
Alternatively, you may run these phases separately, using beta1-demo2's separate merge.properties and ingest.properties files.
object-specs
This section describes the XML files in the directory given by the "specifics" property. Each XML file describes the datastream URLs and metadata for one object to ingest. All non-METS namespaces used in metadata must be declared, as xmlns:uvalibadmin is in the example. IDs and xlink:hrefs in the example correspond to like-named attributes in the Fedora METS template.

Metadata IDs here map to those found in the Fedora METS:amdSec and METS:dmdSec elements. The associated metadata is substituted as the content of the METS:xmlData element nested within that METS:amdSec or METS:dmdSec element.

Datastream IDs here map to those found in the Fedora METS:fileGrp element (the nested one, not the nesting one). The associated xlink:href attribute is substituted into the METS:FLocat element nested within that METS:fileGrp element.

Additional structure may be added later to support other substitutions, e.g., batch- or object-specific labels.
<?xml version="1.0" encoding="ISO-8859-1"?>
<input OBJID="test:2800"
    xmlns:METS="http://www.loc.gov/METS/"
    xmlns:uvalibadmin="http://www.lib.virginia.edu/uvalibadmin/">
  <metadata>
    <metadata ID="RIGHTS1">
      <uvalibadmin:admin>
        <uvalibadmin:adminrights>
          <uvalibadmin:policy>
            <uvalibadmin:access>unrestricted</uvalibadmin:access>
            <uvalibadmin:use>educational</uvalibadmin:use>
          </uvalibadmin:policy>
        </uvalibadmin:adminrights>
      </uvalibadmin:admin>
    </metadata>
    . . .
    other metadata
    . . .
  </metadata>
  <datastreams>
    <datastream ID="DS1" xlink:href="http://localhost:8080/beta1-demo/thumb/4868090.jpg" />
    . . .
    other datastreams
    . . .
  </datastreams>
</input>
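The substitution rules described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not BatchTool's actual implementation: the template is a heavily reduced, hypothetical stand-in for a real Fedora METS template, the xlink namespace URI is assumed (the example above does not declare it, so the sketch declares it explicitly), and copying OBJID onto the METS root is also an assumption. The function name merge is hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"  # assumption: not given in the source
ADMIN = "http://www.lib.virginia.edu/uvalibadmin/"
HREF = "{%s}href" % XLINK

# Hypothetical, heavily reduced template; a real Fedora METS file has more structure.
TEMPLATE = """<METS:mets xmlns:METS="http://www.loc.gov/METS/"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <METS:amdSec ID="RIGHTS1"><METS:xmlData/></METS:amdSec>
  <METS:fileGrp>
    <METS:fileGrp ID="DS1"><METS:FLocat xlink:href=""/></METS:fileGrp>
  </METS:fileGrp>
</METS:mets>"""

SPEC = """<input OBJID="test:2800"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:uvalibadmin="http://www.lib.virginia.edu/uvalibadmin/">
  <metadata>
    <metadata ID="RIGHTS1">
      <uvalibadmin:admin>
        <uvalibadmin:access>unrestricted</uvalibadmin:access>
      </uvalibadmin:admin>
    </metadata>
  </metadata>
  <datastreams>
    <datastream ID="DS1"
        xlink:href="http://localhost:8080/beta1-demo/thumb/4868090.jpg"/>
  </datastreams>
</input>"""


def merge(template_xml, spec_xml):
    """Merge one object-spec into a copy of the template; return the METS root."""
    mets = ET.fromstring(template_xml)
    spec = ET.fromstring(spec_xml)
    mets.set("OBJID", spec.get("OBJID"))  # assumption: OBJID goes on the root

    metadata = {m.get("ID"): m for m in spec.findall("./metadata/metadata")}
    hrefs = {d.get("ID"): d.get(HREF)
             for d in spec.findall("./datastreams/datastream")}

    # Metadata: replace the content of xmlData inside the matching amdSec/dmdSec.
    for tag in ("amdSec", "dmdSec"):
        for sec in mets.iter("{%s}%s" % (METS, tag)):
            source = metadata.get(sec.get("ID"))
            if source is None:
                continue
            for xml_data in sec.iter("{%s}xmlData" % METS):
                for child in list(xml_data):
                    xml_data.remove(child)
                xml_data.extend(list(source))

    # Datastreams: set xlink:href on FLocat inside the matching nested fileGrp.
    for grp in mets.iter("{%s}fileGrp" % METS):
        href = hrefs.get(grp.get("ID"))
        if href is None:
            continue  # the nesting fileGrp carries no matching ID
        for flocat in grp.iter("{%s}FLocat" % METS):
            flocat.set(HREF, href)
    return mets


merged = merge(TEMPLATE, SPEC)
print(merged.find(".//{%s}FLocat" % METS).get(HREF))
```

Matching by ID in both directions is why the object-specs and the template need the same number and tagging of datastreams and metadata elements.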
(beta2: processing metadata and data tree)
The planned initial processing phase of BatchTool will work with structures like the following. The ordering of the tree will be flexible: the tree level at which each Fedora entity appears, and especially where the repeating object identifiers appear, will be specified in input by path expressions with wildcards for object identification.
/datastream tree base . . . / medium / 123.jpg
                                       124.jpg
                            / thumb  / 123.jpg
                                       124.jpg

/metadata tree base . . . / medium / technical / 123.xml
                                                 124.xml
                          / thumb  / technical / 123.xml
                                                 124.xml
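Because this phase is only planned, its input syntax is not defined in this document. Purely as an illustration of grouping tree files by object identifier, the Python sketch below assumes a shell-style wildcard pattern, takes the file stem as the object ID, and takes the parent directory name as the datastream kind. The function name group_by_object, the pattern syntax, and the extra readme.txt entry are all hypothetical.

```python
import fnmatch
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_object(paths, pattern):
    """Group matching paths as {object_id: {directory_name: path}}."""
    groups = defaultdict(dict)
    for p in paths:
        if not fnmatch.fnmatchcase(p, pattern):
            continue  # ignore files the wildcard pattern does not select
        path = PurePosixPath(p)
        # Assumption: file stem identifies the object, parent names the kind.
        groups[path.stem][path.parent.name] = p
    return dict(groups)


paths = [
    "medium/123.jpg",
    "medium/124.jpg",
    "thumb/123.jpg",
    "thumb/124.jpg",
    "thumb/readme.txt",
]

groups = group_by_object(paths, "*/*.jpg")
print(sorted(groups))  # ['123', '124']
print(groups["123"])   # {'medium': 'medium/123.jpg', 'thumb': 'thumb/123.jpg'}
```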
notes
1. Note the case used in ID and xlink:href attributes.
2. Fedora will retain as PIDs only OBJIDs prefixed "test:". Other OBJIDs will be replaced by generated PIDs.
3. You must change the OBJIDs to new values before rerunning the ingest phase of either demo.
4. FEDORA_HOME must have the value C:\mellon\dist\server (replace "mellon" as appropriate). A dependency currently requires this.
5. BatchTool does not validate, but ingest into Fedora does.
6. Directories referred to in properties must exist before BatchTool is run.
7. BatchTool does not clean up files created in intermediate steps.
8. Object-specs must meet the structural requirements of templates: same number and tagging of datastreams, same number and tagging of metadata elements.
9. Fedora will not ingest a METS file whose METS:xmlData elements are empty or contain non-tagged character data.
10. See also the object-specs section above.
11. When you create your own test batch, input files can be anywhere in your file space.
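Since BatchTool itself does not validate (note 5), a rough pre-check of notes 2 and 9 can be sketched in Python. This is only an approximation of the stated rules, not Fedora's validation logic, and the function names xml_data_problems and keeps_objid_as_pid are hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"


def xml_data_problems(mets_xml):
    """List METS:xmlData elements that are empty or hold non-tagged text."""
    problems = []
    root = ET.fromstring(mets_xml)
    for i, xd in enumerate(root.iter("{%s}xmlData" % METS), 1):
        if len(xd) == 0:
            problems.append("xmlData #%d has no child elements" % i)
        # Character data sits in .text (before the first child) and in
        # each child's .tail (after that child).
        texts = [xd.text] + [child.tail for child in xd]
        if any(t and t.strip() for t in texts):
            problems.append("xmlData #%d contains non-tagged character data" % i)
    return problems


def keeps_objid_as_pid(objid):
    """Note 2: only OBJIDs prefixed 'test:' are retained as PIDs."""
    return objid.startswith("test:")


wrapper = '<METS:mets xmlns:METS="http://www.loc.gov/METS/">%s</METS:mets>'
good = wrapper % "<METS:xmlData><a/></METS:xmlData>"
empty = wrapper % "<METS:xmlData/>"
mixed = wrapper % "<METS:xmlData>stray text<a/></METS:xmlData>"

print(xml_data_problems(good))   # []
print(xml_data_problems(empty))  # ['xmlData #1 has no child elements']
print(xml_data_problems(mixed))  # ['xmlData #1 contains non-tagged character data']
print(keeps_objid_as_pid("test:2800"))  # True
```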
properties
The last two columns give the requirement for each processing phase.

Property       Definition                                                         Valid values                  Merge-objects phase       Ingest phase
-------------  -----------------------------------------------------------------  ----------------------------  ------------------------  ------------------------------
merge-objects  do object-merge phase?                                             yes | no                      required                  optional (default = no)
ingest         do ingest phase?                                                   yes | no                      optional (default = no)   required
template       the common structure of each object of the batch                   Fedora METS object            required                  N/A
specifics      directory holding (all and only) simple XML object-specs           absolute filepath             required                  N/A
objects        directory holding (all and only) Fedora METS objects to ingest     absolute filepath             required                  required
server-fqdn    Fedora repository's fully qualified domain name                    fully qualified domain name   N/A                       optional (default = localhost)
server-port    Fedora repository's port                                           port                          N/A                       optional (default = 8080)
ingested-pids  path to output file to record pids assigned to the files ingested  absolute path to file         N/A                       required
pids-format    formatting of that file                                            xml | text                    N/A                       optional (default = xml)