Bug Hunting in Hyrax
I recently had to find a bug in a Hyrax application, and I thought it might be helpful if I documented the process.
1. Define the problem
I like to start a bug hunt with a clear description of what exactly I’m trying to solve. A great place to do this is in a ticket on the team’s board. In this case, the problem was that when new works were submitted to Hyrax via the create work form, the visibility was not being persisted. I find it helpful to write out the bug definition as a user story:
As a content contributor, when I submit a new work through the form, I want to be able to make it public so that it will be visible to the world. Instead, the work always ends up being marked as restricted / private.
2. Write a test. Or several.
I wrote a test to check whether the issue was with visibility in general, or whether it was specific to form submission. One big question for me with this bug is whether the problem is with the form submission or the object creation process, so I first wrote a test to see what happens when we create an object without the form.
I like FactoryBot for writing tests, and in a Hyrax app I always make a Hyrax::UploadedFile
factory like this:
FactoryBot.define do
factory :uploaded_file, class: Hyrax::UploadedFile do
file { Rack::Test::UploadedFile.new("#{::Rails.root}/spec/fixtures/image.jp2") }
user_id { 1 }
end
end
And here is my test to sanity check the actor stack, to ensure that, assuming all the parameters reach it, it will behave as expected. This test passed, which tells me that the problem isn’t in the actor stack:
require 'rails_helper'
include Warden::Test::Helpers
RSpec.describe 'Sanity check the actor stack', type: feature, js: false, clean: true do
context 'a new object created without using the submission form' do
let(:user) { FactoryBot.create(:user) }
let(:article) { FactoryBot.build(:article, depositor: user.user_key, visibility: "open") }
let(:uploaded_file) { FactoryBot.create(:uploaded_file, user_id: user.id)}
let(:attributes_for_actor) { { uploaded_files: [uploaded_file.id] } }
it "saves the expected visibility" do
env = Hyrax::Actors::Environment.new(article, ::Ability.new(user), attributes_for_actor)
Hyrax::CurationConcern.actor.create(env)
expect(Article.count).to eq 1
post_actor_stack_article = Article.last
expect(post_actor_stack_article.visibility).to eq "open"
end
end
end
So, interesting: I haven’t been able to write a failing test for this bug yet. Let’s try again, this time using the form:
# Generated via
# `rails generate hyrax:work Article`
require 'rails_helper'
include Warden::Test::Helpers
# NOTE: If you generated more than one work, you have to set "js: true"
RSpec.describe 'Create a Article', type: feature, js: true, clean: true do
context 'a logged in user' do
let(:user_attributes) do
{ email: 'test@example.com' }
end
let(:user) do
User.new(user_attributes) { |u| u.save(validate: false) }
end
let(:admin_set_id) { AdminSet.find_or_create_default_admin_set_id }
let(:permission_template) { Hyrax::PermissionTemplate.find_or_create_by!(source_id: admin_set_id) }
let(:workflow) { Sipity::Workflow.create!(active: true, name: 'test-workflow', permission_template: permission_template) }
before do
# Create a single action that can be taken
Sipity::WorkflowAction.create!(name: 'submit', workflow: workflow)
# Grant the user access to deposit into the admin set.
Hyrax::PermissionTemplateAccess.create!(
permission_template_id: permission_template.id,
agent_type: 'user',
agent_id: user.user_key,
access: 'deposit'
)
login_as user
end
it do
visit '/dashboard'
click_link "Works"
click_link "Add new work"
# If you generate more than one work uncomment these lines
choose "payload_concern", option: "Article"
click_button "Create work"
expect(page).to have_content "Add New Article"
click_link "Files" # switch tab
expect(page).to have_content "Add files"
expect(page).to have_content "Add folder"
within('span#addfiles') do
attach_file("files[]", Rails.root.join('spec/fixtures/image.jp2'), visible: false)
attach_file("files[]", Rails.root.join('spec/fixtures/jp2_fits.xml'), visible: false)
end
click_link "Descriptions" # switch tab
fill_in('Title', with: 'My Test Work')
fill_in('Creator', with: 'Doe, Jane')
fill_in('Description', with: 'A brief description of my article')
select('In Copyright', from: 'article_rights_statement')
# With selenium and the chrome driver, focus remains on the
# select box. Click outside the box so the next line can't find
# its element
find('body').click
select('Article', from: 'article_resource_type')
# With selenium and the chrome driver, focus remains on the
# select box. Click outside the box so the next line can't find
# its element
find('body').click
choose('article_visibility_open')
expect(page).to have_content('Please note, making something visible to the world (i.e. marking this as Public) may be viewed as publishing which could impact your ability to')
check('agreement')
click_on('Save')
expect(page).to have_content('My Test Work')
expect(page).to have_content "Your files are being processed by Hyrax in the background."
expect(Article.count).to eq 1
article = Article.last
expect(article.visibility).to eq "open"
end
end
end
If you do any Hyrax development, you’ll notice that this test is mostly the Hyrax-generated feature spec, with a few small tweaks. I had already been spending time ensuring that the feature specs were running for this project, so I had already spent some time tweaking this and I knew it was green. All I added were the last two lines:
article = Article.last
expect(article.visibility).to eq "open"
And this test FAILS! Hooray, a test that fails for the right reasons. I have isolated my bug and I now have clear criteria for what it means for this bug to be fixed, along with a check to ensure it doesn’t creep back in as a regression. Now, let’s actually find the bug.
3. The bug hunt
My primary tool on a bug hunt is the byebug
command. I drop it into various places in the code, run my test, and see if I hit my byebug. Once I’m in my byebug console, I poke around to see what I can see. If everything looks normal, I move on.
First stop: WorksControllerBehavior
On our tour of what happens to the data after it is submitted via the Hyrax form, we pass briefly through our locally generated controller. We’ve just submitted an article, so the controller in question is app/controllers/hyrax/articles_controller.rb
. However, upon examination it’s clear there isn’t much there. Our local controller gives us a place to put overrides that might be specific to that Work type, but unless we’ve customized it most of its behavior is going to come via an include:
include Hyrax::WorksControllerBehavior
I’m looking for a create
method into which to drop a byebug, so I go to find the copy of Hyrax::WorksControllerBehavior
that my test is executing. In this case, because we use a docker container development envionment at Notch8, the process looks like this:
> bess@lorca > docker-compose exec web bash
root@bc6de9202513:/data# bundle show hyrax
/usr/local/bundle/gems/hyrax-2.9.0
root@bc6de9202513:/data# cd /usr/local/bundle/gems/hyrax-2.9.0
root@bc6de9202513:/usr/local/bundle/gems/hyrax-2.9.0# vi app/controllers/concerns/hyrax/works_controller_behavior.rb
I find the create
method, and drop a byebug in at the beginning. I run my test again and I see:
[51, 60] in /usr/local/bundle/gems/hyrax-2.9.0/wp-content/controllers/concerns/hyrax/works_controller_behavior.rb
51: build_form
52: end
53:
54: def create
55: byebug
=> 56: if actor.create(actor_environment)
57: after_create_response
58: else
59: respond_to do |wants|
60: wants.html do
(byebug) params
<ActionController::Parameters {"utf8"=>"✓", "article"=>{"title"=>["My Test Work"], "creator"=>["Doe, Jane"], "description"=>["A brief description of my article"], "resource_type"=>["", "Article"], "rights_statement"=>"http://rightsstatements.org/vocab/InC/1.0/", "abstract"=>"", "bibliographic_citation"=>[""], "contributor"=>[""], "date_created"=>[""], "date_issued"=>"", "extent"=>[""], "funder"=>[""], "funder_identifier"=>[""], "grant_award"=>[""], "grant_number"=>[""], "grant_uri"=>[""], "identifier"=>[""], "institution_organization"=>[""], "issue"=>"", "keyword"=>[""], "language"=>[""], "license"=>[""], "note"=>[""], "peer_review_status"=>"", "publisher"=>[""], "related_resource"=>[""], "rights_notes"=>[""], "school"=>[""], "source"=>[""], "subject"=>[""], "title_alternative"=>[""], "volume"=>"", "admin_set_id"=>"admin_set/default", "member_of_collection_ids"=>"", "visibility"=>"open", "visibility_during_embargo"=>"restricted", "embargo_release_date"=>"2020-10-24", "visibility_after_embargo"=>"open", "visibility_during_lease"=>"open", "lease_expiration_date"=>"2020-10-24", "visibility_after_lease"=>"restricted"}, "uploaded_files"=>["1", "2"], "new_group_name_skel"=>"Select a group", "new_group_permission_skel"=>"none", "new_user_name_skel"=>"", "new_user_permission_skel"=>"none", "agreement"=>"1", "locale"=>"en", "controller"=>"hyrax/articles", "action"=>"create"} permitted: false>
So, first question answered: The visibility IS being sent from the form, and at the time it enters the actor stack on line 56 visibility does indeed equal open.
Time to dive into the actor stack.
A quick tour of the actor stack
Here is a quick way to find out what actors your object will traverse in the actor stack:
(byebug) Hyrax::CurationConcern.actor
#<Hyrax::Actors::TransactionalRequest:0x0000557a92022148 @next_actor=#<Hyrax::Actors::OptimisticLockValidator:0x0000557a92022170 @next_actor=#<Hyrax::Actors::CreateWithRemoteFilesActor:0x0000557a92022198 @next_actor=#<Hyrax::Actors::CreateWithFilesActor:0x0000557a920221c0 @next_actor=#<Hyrax::Actors::CollectionsMembershipActor:0x0000557a920221e8 @next_actor=#<Hyrax::Actors::AddToWorkActor:0x0000557a92022210 @next_actor=#<Hyrax::Actors::AttachMembersActor:0x0000557a92022238 @next_actor=#<Hyrax::Actors::ApplyOrderActor:0x0000557a92022260 @next_actor=#<Hyrax::Actors::DefaultAdminSetActor:0x0000557a92022288 @next_actor=#<Hyrax::Actors::InterpretVisibilityActor:0x0000557a920222b0 @next_actor=#<Hyrax::Actors::TransferRequestActor:0x0000557a920222d8 @next_actor=#<Hyrax::Actors::ApplyPermissionTemplateActor:0x0000557a92022300 @next_actor=#<Hyrax::Actors::CleanupFileSetsActor:0x0000557a92022328 @next_actor=#<Hyrax::Actors::CleanupTrophiesActor:0x0000557a92022350 @next_actor=#<Hyrax::Actors::FeaturedWorkActor:0x0000557a92022378 @next_actor=#<Hyrax::Actors::ModelActor:0x0000557a920223a0 @next_actor=#<Hyrax::Actors::InitializeWorkflowActor:0x0000557a920223c8 @next_actor=#<Hyrax::Actors::Terminator:0x0000557a92022468>>>>>>>>>>>>>>>>>>
Let’s break that out into something easier to read:
- Hyrax::Actors::TransactionalRequest
- Hyrax::Actors::OptimisticLockValidator
- Hyrax::Actors::CreateWithRemoteFilesActor
- Hyrax::Actors::CreateWithFilesActor
- Hyrax::Actors::CollectionsMembershipActor
- Hyrax::Actors::AddToWorkActor
- Hyrax::Actors::AttachMembersActor
- Hyrax::Actors::ApplyOrderActor
- Hyrax::Actors::DefaultAdminSetActor
- Hyrax::Actors::InterpretVisibilityActor
- Hyrax::Actors::TransferRequestActor
- Hyrax::Actors::ApplyPermissionTemplateActor
- Hyrax::Actors::CleanupFileSetsActor
- Hyrax::Actors::CleanupTrophiesActor
- Hyrax::Actors::FeaturedWorkActor
- Hyrax::Actors::ModelActor
- Hyrax::Actors::InitializeWorkflowActor
- Hyrax::Actors::Terminator
Hyrax::Actors::CreateWithRemoteFilesActor
:
[11, 20] in /usr/local/bundle/gems/hyrax-2.9.0/wp-content/actors/hyrax/actors/create_with_remote_files_actor.rb
11: class CreateWithRemoteFilesActor < Hyrax::Actors::AbstractActor
12: # @param [Hyrax::Actors::Environment] env
13: # @return [Boolean] true if create was successful
14: def create(env)
15: byebug
=> 16: remote_files = env.attributes.delete(:remote_files)
17: next_actor.create(env) && attach_files(env, remote_files)
18: end
19:
20: # @param [Hyrax::Actors::Environment] env
When I examine the params at this point in the stack, I see something interesting: visibility is gone.
(byebug) env.attributes
{"title"=>["My Test Work"], "abstract"=>nil, "bibliographic_citation"=>[], "contributor"=>[], "creator"=>["Doe, Jane"], "date_created"=>[], "date_issued"=>nil, "description"=>["A brief description of my article"], "extent"=>[], "funder"=>[], "funder_identifier"=>[], "grant_award"=>[], "grant_number"=>[], "grant_uri"=>[], "identifier"=>[], "institution_organization"=>[], "issue"=>nil, "keyword"=>[], "language"=>[], "license"=>[], "note"=>[], "peer_review_status"=>nil, "publisher"=>[], "related_resource"=>[], "resource_type"=>["Article"], "rights_notes"=>[], "rights_statement"=>"http://rightsstatements.org/vocab/InC/1.0/", "school"=>[], "source"=>[], "subject"=>[], "title_alternative"=>[], "volume"=>nil, "remote_files"=>[], "uploaded_files"=>["1", "2"]}
When I examine the nascent object that the actor stack is acting upon, I see that it is mostly empty, and has the default visibility value, which is restricted
:
(byebug) env.curation_concern
#<Article id: nil, head: [], tail: [], depositor: nil, title: [], date_uploaded: nil, date_modified: nil, state: nil, proxy_depositor: nil, on_behalf_of: nil, arkivo_checksum: nil, owner: nil, date_migrated: nil, format: [], bibliographic_citation: [], date_issued: nil, extent: [], institution_organization: [], note: [], related_resource: [], rights_notes: [], title_alternative: [], abstract: nil, funder: [], funder_identifier: [], grant_award: [], grant_number: [], grant_uri: [], issue: nil, peer_review_status: nil, school: [], source: [], volume: nil, label: nil, relative_path: nil, import_url: nil, resource_type: [], creator: [], contributor: [], description: [], keyword: [], license: [], rights_statement: [], publisher: [], date_created: [], subject: [], language: [], identifier: [], based_near: [], related_url: [], access_control_id: nil, representative_id: nil, thumbnail_id: nil, rendering_ids: [], admin_set_id: nil, embargo_id: nil, lease_id: nil>
(byebug) env.curation_concern.visibility
"restricted"
So, by the time we reach Hyrax::Actors::CreateWithRemoteFilesActor
our “open” visibility value is missing, so it never gets applied to our skeleton curation_concern
, which keeps its default visibility value of restricted
. Hmm... I need to look eariler in the process. I do the same check in Hyrax::Actors::TransactionalRequest
and see that visibility
is also missing there. Where is my missing attribute?
Permitted params
The above got me far enough that I was able to ask a concrete question in the #hyrax channel on Samvera slack, where Tamsin Johnson pointed me to this place where params get sanitized:
[265, 274] in /usr/local/bundle/gems/hyrax-2.9.0/wp-content/controllers/concerns/hyrax/works_controller_behavior.rb
265:
266: # Add uploaded_files to the parameters received by the actor.
267: def attributes_for_actor
268: byebug
269: raw_params = params[hash_key_for_curation_concern]
=> 270: attributes = if raw_params
271: work_form_service.form_class(curation_concern).model_attributes(raw_params)
272: else
273: {}
274: end
And there, on line 271, is where visibility
goes missing. But WHY????
So what’s happening here is that we’re using Hyrax::ArticleForm
to define what the permitted parameters are to get submitted through the form:
(byebug) work_form_service.form_class(curation_concern)
Hyrax::ArticleForm
And if I go look at Hyrax::ArticleForm
I see that self.terms
has been redefined:
self.terms = [
:title,
:abstract,
:bibliographic_citation,
...
Instead of redefining self.terms
here, the pattern I prefer would be to add any customized fields to it, like this example from the RepoCamp curriculum:
When I change Hyrax::ArticleForm
to instead use a +=
, thus keeping all of the fields that come pre-defined in Hyrax, my feature test now passes.
self.terms += [
:title,
:abstract,
:bibliographic_citation,
...
In summary
- Write your tests first
- Keep your end-to-end feature tests green when you’re doing development
- When you customize forms in Hyrax don’t re-define the
terms
, add to it and subtract from it as you like, but you probably don’t want to lose all the terms Hyrax gives you out of the box.