Fix/4 #7

Closed
wants to merge 2 commits into from


josephlewis42
Contributor

Fixes #4 by adding a new field, file_prefix, that does server-side filtering.

This can be tested with the pre-release: https://storage.googleapis.com/logstash-prereleases/logstash-input-google_cloud_storage-0.12.0-java.gem
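
As a rough illustration of why a server-side prefix helps, here is a minimal sketch in plain Ruby. It is not the plugin's code: the blob names, `list_blobs`, and `processable` are hypothetical stand-ins, and the in-memory array only simulates a bucket. The assumption is that a prefix narrows what the listing returns, while a regex such as file_matches can only be applied after the names have already been listed.

```ruby
# Hypothetical in-memory stand-in for a bucket listing; names are invented.
BLOB_NAMES = [
  'transcriptions/a.json',
  'transcriptions/b.json',
  'transcriptions/notes.txt',
  'recordings/a.wav',
  'recordings/b.wav'
].freeze

# Simulates a listing call: only names starting with the prefix are
# returned to the client. An empty prefix returns everything.
def list_blobs(names, file_prefix = '')
  names.select { |name| name.start_with?(file_prefix) }
end

# Client-side regex filter, applied to whatever the listing returned.
# Assumption: the pattern is used unanchored, as written here.
def processable(names, file_matches)
  pattern = Regexp.new(file_matches)
  names.select { |name| pattern.match?(name) }
end

without_prefix = list_blobs(BLOB_NAMES)
with_prefix    = list_blobs(BLOB_NAMES, 'transcriptions/')

puts "names listed without prefix: #{without_prefix.size}"
puts "names listed with prefix:    #{with_prefix.size}"
puts processable(with_prefix, 'transcriptions/.*json').inspect
```

Both paths end with the same set of processable names; the difference in this sketch is only how many names the listing step has to return first.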

@josephlewis42 josephlewis42 added the enhancement New feature or request label Oct 10, 2019
@tmegow

tmegow commented Oct 10, 2019

I rebuilt using the custom version of the GCS plugin (via the gem file). I did not change the pipeline config values, and I'm seeing these errors during execution:

Error: no method 'list' for arguments (org.jruby.RubyString,org.jruby.java.proxies.ArrayJavaProxy) on Java::ComGoogleCloudStorage::StorageImpl
  available overloads:
    (com.google.cloud.storage.Storage.BucketListOption[])
    (java.lang.String,com.google.cloud.storage.Storage.BlobListOption[])
  Exception: NameError
  Stack: /usr/share/logstash/vendor/local_gems/7a400dae/gcs/lib/logstash/inputs/cloud_storage/client.rb:33:in `list_blobs'
/usr/share/logstash/vendor/local_gems/7a400dae/gcs/lib/logstash/inputs/google_cloud_storage.rb:86:in `list_processable_blobs'
/usr/share/logstash/vendor/local_gems/7a400dae/gcs/lib/logstash/inputs/google_cloud_storage.rb:69:in `list_download_process'
/usr/share/logstash/vendor/local_gems/7a400dae/gcs/lib/logstash/inputs/google_cloud_storage.rb:62:in `block in run'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/interval.rb:20:in `interval'
/usr/share/logstash/vendor/local_gems/7a400dae/gcs/lib/logstash/inputs/google_cloud_storage.rb:61:in `run'
/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:309:in `inputworker'
/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:302:in `block in start_input'
[DEBUG] 2019-10-10 18:23:08.304 [[transcripts]<google_cloud_storage] googlecloudstorage - Closing {:plugin=>"LogStash::Inputs::GoogleCloudStorage"}
[DEBUG] 2019-10-10 18:23:08.305 [[transcripts]<google_cloud_storage] pluginmetadata - Removing metadata for plugin 8cedb02bc64a2e82c736b877d216f90bf7e7c12c7c9c676f06b7ac97919018b6
[INFO ] 2019-10-10 18:23:08.305 [[transcripts]<google_cloud_storage] googlecloudstorage - Fetching blobs from shsp-sales-dialer-dev
[ERROR] 2019-10-10 18:23:08.307 [[transcripts]<google_cloud_storage] javapipeline - A plugin had an unrecoverable error. Will restart this plugin.
  Pipeline_id:transcripts
  Plugin: <LogStash::Inputs::GoogleCloudStorage bucket_id=>"shsp-sales-dialer-dev", json_key_file=>"/sales-dialer/creds/gcp_service_account.json", codec=><LogStash::Codecs::JSON id=>"json_da07edc0-42c8-45d9-9107-8b647abf4f5e", enable_metric=>true, charset=>"UTF-8">, metadata_key=>"x-goog-meta-logstash-transcripts", interval=>60, id=>"8cedb02bc64a2e82c736b877d216f90bf7e7c12c7c9c676f06b7ac97919018b6", file_matches=>"transcriptions/.*json", enable_metric=>true, file_exclude=>"^$", delete=>false, unpack_gzip=>true, temp_directory=>"/tmp/ls-in-gcs">

@josephlewis42
Contributor Author

@tmegow, just to make sure, did you pick up the second change I made as part of the commit? https://github.com/logstash-plugins/logstash-input-google_cloud_storage/pull/7/files#diff-c74e95df46ddcc4954cdb6235bbd0793R24-R32

@tmegow

tmegow commented Oct 10, 2019

[ERROR] 2019-10-10 20:42:04.393 [Converge PipelineAction::Create<blocks>] registry - Problems loading a plugin with {:type=>"input", :name=>"google_cloud_storage", :path=>"logstash/inputs/google_cloud_storage", :error_message=>"\n\n\tyou might need to reinstall the gem which depends on the missing jar or in case there is Jars.lock then resolve the jars with `lock_jars` command\n\nno such file to load -- com/google/cloud/google-cloud-storage/1.62.0/google-cloud-storage-1.62.0 (LoadError)", :error_class=>RuntimeError, :error_backtrace=>["uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/jar_dependencies.rb:356:in `do_require'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/jar_dependencies.rb:265:in `block in require_jar'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/jar_dependencies.rb:307:in `require_jar_with_block'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/jar_dependencies.rb:264:in `require_jar'", "/usr/share/logstash/lib/bootstrap/patches/jar_dependencies.rb:6:in `require_jar'", "/usr/share/logstash/vendor/local_gems/2a2dab14/gcs/lib/logstash-input-google_cloud_storage_jars.rb:4:in `<main>'", "org/jruby/RubyKernel.java:987:in `require'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/polyglot-0.3.5/lib/polyglot.rb:65:in `require'", "/usr/share/logstash/vendor/local_gems/2a2dab14/gcs/lib/logstash/inputs/cloud_storage/client.rb:1:in `<main>'", "org/jruby/RubyKernel.java:987:in `require'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/polyglot-0.3.5/lib/polyglot.rb:65:in `require'", "/usr/share/logstash/vendor/local_gems/2a2dab14/gcs/lib/logstash/inputs/cloud_storage/client.rb:5:in `<main>'", "org/jruby/RubyKernel.java:987:in `require'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/polyglot-0.3.5/lib/polyglot.rb:65:in `require'", "/usr/share/logstash/logstash-core/lib/logstash/plugins/registry.rb:191:in `legacy_lookup'", "/usr/share/logstash/logstash-core/lib/logstash/plugins/registry.rb:166:in `block in 
lookup'", "org/jruby/ext/thread/Mutex.java:165:in `synchronize'", "/usr/share/logstash/logstash-core/lib/logstash/plugins/registry.rb:162:in `lookup'", "/usr/share/logstash/logstash-core/lib/logstash/plugins/registry.rb:216:in `lookup_pipeline_plugin'", "/usr/share/logstash/logstash-core/lib/logstash/plugin.rb:143:in `lookup'", "org/logstash/plugins/PluginFactoryExt.java:203:in `plugin'", "org/logstash/plugins/PluginFactoryExt.java:120:in `buildInput'", "org/logstash/execution/JavaBasePipelineExt.java:50:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:24:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:36:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:325:in `block in converge_state'"]}
[DEBUG] 2019-10-10 20:42:04.397 [Converge PipelineAction::Create<blocks>] registry - Problems loading the plugin with {:type=>"input", :name=>"google_cloud_storage"}
[ERROR] 2019-10-10 20:42:04.489 [Converge PipelineAction::Create<blocks>] agent - Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:blocks, :exception=>"LogStash::PluginLoadingError", :message=>"Couldn't find any input plugin named 'google_cloud_storage'. Are you sure this is correct? Trying to load the google_cloud_storage input plugin resulted in this error: Problems loading the requested plugin named google_cloud_storage of type input. Error: RuntimeError \n\n\tyou might need to reinstall the gem which depends on the missing jar or in case there is Jars.lock then resolve the jars with `lock_jars` command\n\nno such file to load -- com/google/cloud/google-cloud-storage/1.62.0/google-cloud-storage-1.62.0 (LoadError)", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/plugins/registry.rb:221:in `lookup_pipeline_plugin'", "/usr/share/logstash/logstash-core/lib/logstash/plugin.rb:143:in `lookup'", "org/logstash/plugins/PluginFactoryExt.java:203:in `plugin'", "org/logstash/plugins/PluginFactoryExt.java:120:in `buildInput'", "org/logstash/execution/JavaBasePipelineExt.java:50:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:24:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:36:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:325:in `block in converge_state'"]}

Output when building the gem file:

$ gem build logstash-input-google_cloud_storage.gemspec
WARNING:  open-ended dependency on logstash-codec-plain (>= 0) is not recommended
  if logstash-codec-plain is semantically versioned, use:
    add_runtime_dependency 'logstash-codec-plain', '~> 0'
WARNING:  open-ended dependency on stud (>= 0.0.22) is not recommended
  if stud is semantically versioned, use:
    add_runtime_dependency 'stud', '~> 0.0', '>= 0.0.22'
WARNING:  open-ended dependency on mimemagic (>= 0.3.3) is not recommended
  if mimemagic is semantically versioned, use:
    add_runtime_dependency 'mimemagic', '~> 0.3', '>= 0.3.3'
WARNING:  open-ended dependency on logstash-devutils (>= 0.0.16, development) is not recommended
  if logstash-devutils is semantically versioned, use:
    add_development_dependency 'logstash-devutils', '~> 0.0', '>= 0.0.16'
WARNING:  See http://guides.rubygems.org/specification-reference/ for help
  Successfully built RubyGem
  Name: logstash-input-google_cloud_storage
  Version: 0.12.0
  File: logstash-input-google_cloud_storage-0.12.0-java.gem

Output when installing plugin via the gem file:

Installing logstash-input-google_cloud_storage
Installation successful

@josephlewis42 I rebuilt the plugin gem, making sure to include both commits in this PR. Now I'm getting the error above. Do I have a disconnect between google_cloud_storage and logstash-input-google_cloud_storage?

@tmegow

tmegow commented Jan 24, 2020

@josephlewis42 Were you able to see this working in your testing? I am excited about the possible reduction in our ingress traffic.
Do the build commands I used in my test attempt need amending?

@generatives

I have also run into trouble trying to use this branch, with the same error as @tmegow. Is there a chance this PR will be fixed and merged?

@generatives

generatives commented Jan 7, 2021

@jsvd @kares Tagging you because you seem to be working more actively on Logstash plugins. Is there any possibility of this branch being merged? The change is very valuable for large GCS buckets; in my case the plugin is basically unusable without server-side filtering.

@aksakalmustafa

@josephlewis42 This is a great PR. I've implemented similar functionality locally and am using it. Is there any plan to merge this branch? Otherwise, I'll create a similar PR. Thanks!

@josephlewis42
Contributor Author

@aksakalmustafa go ahead! I'm no longer actively working on this repository and I bet the PR is stale. I'll close this one so yours can take center stage.

Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

file_matches - why so much traffic
5 participants