This page is designed to be a quick reference guide for the GOKb data loading and ingest process. For more detailed information on each topic, please refer to the tutorials linked within the page. If you have not already taken training, contact Jennifer Solomon, GOKb Editor, to set up a time.

Load a file into OpenRefine

  • Open OpenRefine and log into the GOKb extension. Choose Create Project from the left-hand menu. Click Browse and locate the file you want to work with. Click Next. OpenRefine will show you a preview of your data. Scan it to make sure everything looks correct.

...

  • Click Create Project in the top right corner. Your project will automatically open.
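Before loading a file, it can help to run a quick structural check outside OpenRefine. The sketch below is not part of GOKb or OpenRefine. It assumes the file is tab-delimited text (an assumption; adjust the delimiter for other formats), and the function name `ragged_rows` is hypothetical. It reports rows whose field count differs from the header row, which is one of the problems the preview scan is meant to catch.

```python
import csv
import io


def ragged_rows(text, delimiter="\t"):
    """Return (row_number, field_count) for rows whose width differs from the header.

    Row numbers are 1-based and count the header as row 1.
    Completely blank rows are ignored.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)
    if header is None:  # empty input: nothing to report
        return []
    expected = len(header)
    return [
        (number, len(row))
        for number, row in enumerate(reader, start=2)
        if row and len(row) != expected
    ]
```

To check a file on disk, read it first (for example `open(path, encoding="utf-8", newline="").read()`) and pass the text to `ragged_rows`. An empty result means every row has as many fields as the header.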

Check a file into GOKb

  • Click the GOKb button located in the top right corner of the screen. Select Check in this project for the first time.

  • You will be asked to provide the Source, Provider, Name, Description, and Notes (optional). Click Save and Check In.

Clean up data in OpenRefine

Use Macros to quickly rename columns

  • Click Edit in any cell and then right-click to show the Apply Macro option.

  • Click Apply Macro and then search for the KBART column transformation or the provider name macro (e.g., American Society of Chemical Engineers).

  • Double-click the KBART macro and then wait while the application processes. Your columns will be renamed and you should see several error messages disappear.

  • The following three columns always need to be added, and each requires you to look up a controlled value to populate it: platform.host.name, package.name, org.publisher.name

  • The following columns are optional. Add them only if your data contains information for these fields; otherwise, omit them: TIPPPayment, TIPPStatus, Title.OAStatus

  • Address any remaining invalid data errors and warnings, and capture repeat errors

  • Review additional fields and load these as custom columns
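The column requirements above can be sketched as a small check on a project's column headings. This is an illustration only and not part of the GOKb extension: the function name `check_columns` is hypothetical, and the column names are copied exactly as written on this page, so the comparison is case-sensitive by assumption.

```python
# Column names copied from this page; matching is case-sensitive by assumption.
REQUIRED = ("platform.host.name", "package.name", "org.publisher.name")
OPTIONAL = ("TIPPPayment", "TIPPStatus", "Title.OAStatus")


def check_columns(header):
    """Return (missing_required, optional_present) for a list of column names."""
    present = {name.strip() for name in header}
    missing_required = [name for name in REQUIRED if name not in present]
    optional_present = [name for name in OPTIONAL if name in present]
    return missing_required, optional_present
```

An empty first list means all three required columns are present. The second list shows which optional columns the data already carries.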

Ingest a project into GOKb

  • In the left-hand navigation pane of OpenRefine, open the Errors tab. Click the Update GOKb pane, then click Proceed with Ingest. Make any necessary changes, then click Save and Check-In.

  • Wait until the project is 100% ingested before moving on to the GOKb web app.

Verify a package record

  • In the GOKb web application, use the Search > Packages menu to locate the package you are working with.

  • Confirm that your package name is correct. If you think that people might search for the package using a term that isn't contained in the name, add a variant name.

Update package metadata fields (upper section)

  • List verifier: This should be you, or the person you expect to address any review tasks or follow-up work associated with this list. Check that the field shows the correct first and last name, or update it with your name.

  • List verifier date: The date on which you set the current status of the list. Update this field with today's date in YYYY-MM-DD format (for example, 2014-01-01). If you change the Edit status, update this date as well.

  • Provider: The organization responsible for making the package available. This should be automatically populated with the provider selected in OpenRefine.

  • Source: The location of the original data used to create the package. This should be automatically populated with the source selected in OpenRefine.

  • Edit status: This field describes whether the package itself has been verified and is ready to use.
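The date format required for the List verifier date can be illustrated with a short check. This is a minimal sketch, not GOKb's own validation, and the function name `is_valid_verifier_date` is hypothetical. It accepts only real calendar dates written as zero-padded YYYY-MM-DD.

```python
from datetime import datetime


def is_valid_verifier_date(value):
    """Return True only for real calendar dates written as zero-padded YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        # Wrong layout (e.g. 01/01/2014) or impossible date (e.g. 2014-02-30).
        return False
    # strptime tolerates "2014-1-1"; the round trip rejects unpadded values.
    return parsed.strftime("%Y-%m-%d") == value
```

Today's date in the same format is available from `datetime.now().strftime("%Y-%m-%d")`.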

Populate package details (lower section)

...