Bug Hunting in Hyrax

Bug Hunting in Hyrax

I recently had to find a bug in a Hyrax application, and I thought it might be helpful if I documented the process.

1. Define the problem

I like to start a bug hunt with a clear description of what exactly I’m trying to solve. A great place to do this is in a ticket on the team’s board. In this case, the problem was that when new works were submitted to Hyrax via the create work form, the visibility was not being persisted. I find it helpful to write out the bug definition as a user story:

As a content contributor, when I submit a new work through the form, I want to be able to make it public so that it will be visible to the world. Instead, the work always ends up being marked as restricted / private.

2. Write a test. Or several.

I wrote a test to check whether the issue was with visibility in general, or whether it was specific to form submission. One big question for me with this bug is whether the problem is with the form submission or the object creation process, so I first wrote a test to see what happens when we create an object without the form.

I like FactoryBot for writing tests, and in a Hyrax app I always make a Hyrax::UploadedFile factory like this:

FactoryBot.define do
  factory :uploaded_file, class: Hyrax::UploadedFile do
    file { Rack::Test::UploadedFile.new("#{::Rails.root}/spec/fixtures/image.jp2") }
    user_id { 1 }

And here is my test to sanity check the actor stack, to ensure that, assuming all the parameters reach it, it will behave as expected. This test passed, which tells me that the problem isn’t in the actor stack:

require 'rails_helper'
include Warden::Test::Helpers

RSpec.describe 'Sanity check the actor stack', type: feature, js: false, clean: true do
  context 'a new object created without using the submission form' do
    let(:user) { FactoryBot.create(:user) }
    let(:article) { FactoryBot.build(:article, depositor: user.user_key, visibility: "open") }
    let(:uploaded_file) { FactoryBot.create(:uploaded_file, user_id: user.id)}
    let(:attributes_for_actor) { { uploaded_files: [uploaded_file.id] } }
    it "saves the expected visibility" do
        env = Hyrax::Actors::Environment.new(article, ::Ability.new(user), attributes_for_actor)
        expect(Article.count).to eq 1
        post_actor_stack_article = Article.last
        expect(post_actor_stack_article.visibility).to eq "open"

So, interesting: I haven’t been able to write a failing test for this bug yet. Let’s try again, this time using the form:

# Generated via
#  `rails generate hyrax:work Article`
require 'rails_helper'
include Warden::Test::Helpers

# NOTE: If you generated more than one work, you have to set "js: true"
RSpec.describe 'Create a Article', type: feature, js: true, clean: true do
  context 'a logged in user' do
    let(:user_attributes) do
      { email: 'test@example.com' }
    let(:user) do
      User.new(user_attributes) { |u| u.save(validate: false) }
    let(:admin_set_id) { AdminSet.find_or_create_default_admin_set_id }
    let(:permission_template) { Hyrax::PermissionTemplate.find_or_create_by!(source_id: admin_set_id) }
    let(:workflow) { Sipity::Workflow.create!(active: true, name: 'test-workflow', permission_template: permission_template) }

    before do
      # Create a single action that can be taken
      Sipity::WorkflowAction.create!(name: 'submit', workflow: workflow)

      # Grant the user access to deposit into the admin set.
        permission_template_id: permission_template.id,
        agent_type: 'user',
        agent_id: user.user_key,
        access: 'deposit'
      login_as user

    it do
      visit '/dashboard'
      click_link "Works"
      click_link "Add new work"

      # If you generate more than one work uncomment these lines
      choose "payload_concern", option: "Article"
      click_button "Create work"

      expect(page).to have_content "Add New Article"
      click_link "Files" # switch tab
      expect(page).to have_content "Add files"
      expect(page).to have_content "Add folder"
      within('span#addfiles') do
        attach_file("files[]", Rails.root.join('spec/fixtures/image.jp2'), visible: false)
        attach_file("files[]", Rails.root.join('spec/fixtures/jp2_fits.xml'), visible: false)
      click_link "Descriptions" # switch tab
      fill_in('Title', with: 'My Test Work')
      fill_in('Creator', with: 'Doe, Jane')
      fill_in('Description', with: 'A brief description of my article')
      select('In Copyright', from: 'article_rights_statement')

      # With selenium and the chrome driver, focus remains on the
      # select box. Click outside the box so the next line can't find
      # its element
      select('Article', from: 'article_resource_type')

      # With selenium and the chrome driver, focus remains on the
      # select box. Click outside the box so the next line can't find
      # its element
      expect(page).to have_content('Please note, making something visible to the world (i.e. marking this as Public) may be viewed as publishing which could impact your ability to')

      expect(page).to have_content('My Test Work')
      expect(page).to have_content "Your files are being processed by Hyrax in the background."
      expect(Article.count).to eq 1
      article = Article.last
      expect(article.visibility).to eq "open"

If you do any Hyrax development, you’ll notice that this test is mostly the Hyrax-generated feature spec, with a few small tweaks. I had already been spending time ensuring that the feature specs were running for this project, so I had already spent some time tweaking this and I knew it was green. All I added were the last two lines:

      article = Article.last
      expect(article.visibility).to eq "open"

And this test FAILS! Hooray, a test that fails for the right reasons. I have isolated my bug and I now have clear criteria for what it means for this bug to be fixed, along with a check to ensure it doesn’t creep back in as a regression. Now, let’s actually find the bug.

3. The bug hunt

My primary tool on a bug hunt is the byebug command. I drop it into various places in the code, run my test, and see if I hit my byebug. Once I’m in my byebug console, I poke around to see what I can see. If everything looks normal, I move on.

First stop: WorksControllerBehavior

On our tour of what happens to the data after it is submitted via the Hyrax form, we pass briefly through our locally generated controller. We’ve just submitted an article, so the controller in question is app/controllers/hyrax/articles_controller.rb. However, upon examination it’s clear there isn’t much there. Our local controller gives us a place to put overrides that might be specific to that Work type, but unless we’ve customized it most of its behavior is going to come via an include:

    include Hyrax::WorksControllerBehavior

I’m looking for a create method into which to drop a byebug, so I go to find the copy of Hyrax::WorksControllerBehavior that my test is executing. In this case, because we use a docker container development envionment at Notch8, the process looks like this:

> bess@lorca > docker-compose exec web bash
root@bc6de9202513:/data# bundle show hyrax
root@bc6de9202513:/data# cd /usr/local/bundle/gems/hyrax-2.9.0
root@bc6de9202513:/usr/local/bundle/gems/hyrax-2.9.0# vi app/controllers/concerns/hyrax/works_controller_behavior.rb

I find the create method, and drop a byebug in at the beginning. I run my test again and I see:

[51, 60] in /usr/local/bundle/gems/hyrax-2.9.0/wp-content/controllers/concerns/hyrax/works_controller_behavior.rb
   51:       build_form
   52:     end
   54:     def create
   55:      byebug
=> 56:       if actor.create(actor_environment)
   57:         after_create_response
   58:       else
   59:         respond_to do |wants|
   60:           wants.html do
(byebug) params
<ActionController::Parameters {"utf8"=>"✓", "article"=>{"title"=>["My Test Work"], "creator"=>["Doe, Jane"], "description"=>["A brief description of my article"], "resource_type"=>["", "Article"], "rights_statement"=>"http://rightsstatements.org/vocab/InC/1.0/", "abstract"=>"", "bibliographic_citation"=>[""], "contributor"=>[""], "date_created"=>[""], "date_issued"=>"", "extent"=>[""], "funder"=>[""], "funder_identifier"=>[""], "grant_award"=>[""], "grant_number"=>[""], "grant_uri"=>[""], "identifier"=>[""], "institution_organization"=>[""], "issue"=>"", "keyword"=>[""], "language"=>[""], "license"=>[""], "note"=>[""], "peer_review_status"=>"", "publisher"=>[""], "related_resource"=>[""], "rights_notes"=>[""], "school"=>[""], "source"=>[""], "subject"=>[""], "title_alternative"=>[""], "volume"=>"", "admin_set_id"=>"admin_set/default", "member_of_collection_ids"=>"", "visibility"=>"open", "visibility_during_embargo"=>"restricted", "embargo_release_date"=>"2020-10-24", "visibility_after_embargo"=>"open", "visibility_during_lease"=>"open", "lease_expiration_date"=>"2020-10-24", "visibility_after_lease"=>"restricted"}, "uploaded_files"=>["1", "2"], "new_group_name_skel"=>"Select a group", "new_group_permission_skel"=>"none", "new_user_name_skel"=>"", "new_user_permission_skel"=>"none", "agreement"=>"1", "locale"=>"en", "controller"=>"hyrax/articles", "action"=>"create"} permitted: false>

So, first question answered: The visibility IS being sent from the form, and at the time it enters the actor stack on line 56 visibility does indeed equal open.

Time to dive into the actor stack.

A quick tour of the actor stack

Here is a quick way to find out what actors your object will traverse in the actor stack:

(byebug) Hyrax::CurationConcern.actor
#<Hyrax::Actors::TransactionalRequest:0x0000557a92022148 @next_actor=#<Hyrax::Actors::OptimisticLockValidator:0x0000557a92022170 @next_actor=#<Hyrax::Actors::CreateWithRemoteFilesActor:0x0000557a92022198 @next_actor=#<Hyrax::Actors::CreateWithFilesActor:0x0000557a920221c0 @next_actor=#<Hyrax::Actors::CollectionsMembershipActor:0x0000557a920221e8 @next_actor=#<Hyrax::Actors::AddToWorkActor:0x0000557a92022210 @next_actor=#<Hyrax::Actors::AttachMembersActor:0x0000557a92022238 @next_actor=#<Hyrax::Actors::ApplyOrderActor:0x0000557a92022260 @next_actor=#<Hyrax::Actors::DefaultAdminSetActor:0x0000557a92022288 @next_actor=#<Hyrax::Actors::InterpretVisibilityActor:0x0000557a920222b0 @next_actor=#<Hyrax::Actors::TransferRequestActor:0x0000557a920222d8 @next_actor=#<Hyrax::Actors::ApplyPermissionTemplateActor:0x0000557a92022300 @next_actor=#<Hyrax::Actors::CleanupFileSetsActor:0x0000557a92022328 @next_actor=#<Hyrax::Actors::CleanupTrophiesActor:0x0000557a92022350 @next_actor=#<Hyrax::Actors::FeaturedWorkActor:0x0000557a92022378 @next_actor=#<Hyrax::Actors::ModelActor:0x0000557a920223a0 @next_actor=#<Hyrax::Actors::InitializeWorkflowActor:0x0000557a920223c8 @next_actor=#<Hyrax::Actors::Terminator:0x0000557a92022468>>>>>>>>>>>>>>>>>>

Let’s break that out into something easier to read:

  1. Hyrax::Actors::TransactionalRequest
  2. Hyrax::Actors::OptimisticLockValidator
  3. Hyrax::Actors::CreateWithRemoteFilesActor
  4. Hyrax::Actors::CreateWithFilesActor
  5. Hyrax::Actors::CollectionsMembershipActor
  6. Hyrax::Actors::AddToWorkActor
  7. Hyrax::Actors::AttachMembersActor
  8. Hyrax::Actors::ApplyOrderActor
  9. Hyrax::Actors::DefaultAdminSetActor
  10. Hyrax::Actors::InterpretVisibilityActor
  11. Hyrax::Actors::TransferRequestActor
  12. Hyrax::Actors::ApplyPermissionTemplateActor
  13. Hyrax::Actors::CleanupFileSetsActor
  14. Hyrax::Actors::CleanupTrophiesActor
  15. Hyrax::Actors::FeaturedWorkActor
  16. Hyrax::Actors::ModelActor
  17. Hyrax::Actors::InitializeWorkflowActor
  18. Hyrax::Actors::Terminator
I won’t go into all of those, and some of them are clearly not relevant to this use case. I’m going to drop into the first one that I think might be relevant for me, Hyrax::Actors::CreateWithRemoteFilesActor:

[11, 20] in /usr/local/bundle/gems/hyrax-2.9.0/wp-content/actors/hyrax/actors/create_with_remote_files_actor.rb
   11:     class CreateWithRemoteFilesActor < Hyrax::Actors::AbstractActor
   12:       # @param [Hyrax::Actors::Environment] env
   13:       # @return [Boolean] true if create was successful
   14:       def create(env)
   15:        byebug
=> 16:         remote_files = env.attributes.delete(:remote_files)
   17:         next_actor.create(env) && attach_files(env, remote_files)
   18:       end
   20:       # @param [Hyrax::Actors::Environment] env
When I examine the params at this point in the stack, I see something interesting: visibility is gone.

(byebug) env.attributes
{"title"=>["My Test Work"], "abstract"=>nil, "bibliographic_citation"=>[], "contributor"=>[], "creator"=>["Doe, Jane"], "date_created"=>[], "date_issued"=>nil, "description"=>["A brief description of my article"], "extent"=>[], "funder"=>[], "funder_identifier"=>[], "grant_award"=>[], "grant_number"=>[], "grant_uri"=>[], "identifier"=>[], "institution_organization"=>[], "issue"=>nil, "keyword"=>[], "language"=>[], "license"=>[], "note"=>[], "peer_review_status"=>nil, "publisher"=>[], "related_resource"=>[], "resource_type"=>["Article"], "rights_notes"=>[], "rights_statement"=>"http://rightsstatements.org/vocab/InC/1.0/", "school"=>[], "source"=>[], "subject"=>[], "title_alternative"=>[], "volume"=>nil, "remote_files"=>[], "uploaded_files"=>["1", "2"]}
When I examine the nascent object that the actor stack is acting upon, I see that it is mostly empty, and has the default visibility value, which is restricted:

(byebug) env.curation_concern
#<Article id: nil, head: [], tail: [], depositor: nil, title: [], date_uploaded: nil, date_modified: nil, state: nil, proxy_depositor: nil, on_behalf_of: nil, arkivo_checksum: nil, owner: nil, date_migrated: nil, format: [], bibliographic_citation: [], date_issued: nil, extent: [], institution_organization: [], note: [], related_resource: [], rights_notes: [], title_alternative: [], abstract: nil, funder: [], funder_identifier: [], grant_award: [], grant_number: [], grant_uri: [], issue: nil, peer_review_status: nil, school: [], source: [], volume: nil, label: nil, relative_path: nil, import_url: nil, resource_type: [], creator: [], contributor: [], description: [], keyword: [], license: [], rights_statement: [], publisher: [], date_created: [], subject: [], language: [], identifier: [], based_near: [], related_url: [], access_control_id: nil, representative_id: nil, thumbnail_id: nil, rendering_ids: [], admin_set_id: nil, embargo_id: nil, lease_id: nil>
(byebug) env.curation_concern.visibility

So, by the time we reach Hyrax::Actors::CreateWithRemoteFilesActor our “open” visibility value is missing, so it never gets applied to our skeleton curation_concern, which keeps its default visibility value of restricted. Hmm... I need to look eariler in the process. I do the same check in Hyrax::Actors::TransactionalRequest and see that visibility is also missing there. Where is my missing attribute?

Permitted params

The above got me far enough that I was able to ask a concrete question in the #hyrax channel on Samvera slack, where Tamsin Johnson pointed me to this place where params get sanitized:

[265, 274] in /usr/local/bundle/gems/hyrax-2.9.0/wp-content/controllers/concerns/hyrax/works_controller_behavior.rb
   266:       # Add uploaded_files to the parameters received by the actor.
   267:       def attributes_for_actor
   268:           byebug
   269:         raw_params = params[hash_key_for_curation_concern]
=> 270:         attributes = if raw_params
   271:                        work_form_service.form_class(curation_concern).model_attributes(raw_params)
   272:                      else
   273:                        {}
   274:                      end

And there, on line 271, is where visibility goes missing. But WHY????

So what’s happening here is that we’re using Hyrax::ArticleForm to define what the permitted parameters are to get submitted through the form:

(byebug) work_form_service.form_class(curation_concern)

And if I go look at Hyrax::ArticleForm I see that self.terms has been redefined:

    self.terms = [

Instead of redefining self.terms here, the pattern I prefer would be to add any customized fields to it, like this example from the RepoCamp curriculum:

When I change Hyrax::ArticleForm to instead use a +=, thus keeping all of the fields that come pre-defined in Hyrax, my feature test now passes.

   self.terms += [

In summary

  1. Write your tests first
  2. Keep your end-to-end feature tests green when you’re doing development
  3. When you customize forms in Hyrax don’t re-define the terms, add to it and subtract from it as you like, but you probably don’t want to lose all the terms Hyrax gives you out of the box.
Bess Sadler
Bess Sadler