The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more.
- Kubernetes Cluster >= 1.27
- ArgoCD (Optional)
Create the namespace:
kubectl create ns web
Assuming you've checked out this repo, deploy with Kustomize:
kubectl kustomize deployment/ | kubectl apply -f -
Or, to deploy via ArgoCD:
kubectl apply -f deployment/argocd/application.yml
NOTE: Remember to update the Ingress hostname.
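Once the manifests are applied, it can help to confirm what landed in the cluster and which hostname the Ingress currently uses. The following is a minimal sketch, not part of this repo: `show_deployment` is a hypothetical helper name, and it assumes the resources live in the `web` namespace created above.

```shell
# Hypothetical helper (not part of this repo): list what was deployed and
# print the hostname(s) configured on any Ingress in the namespace.
show_deployment() {
  ns="${1:-web}"   # namespace; defaults to the one created above
  # Pods, services, and ingresses in one listing.
  kubectl get pods,svc,ingress -n "$ns"
  # Hostnames from every Ingress rule, space-separated.
  kubectl get ingress -n "$ns" -o jsonpath='{.items[*].spec.rules[*].host}'
}
```

Run `show_deployment` (or `show_deployment other-namespace`) after applying the manifests; if the printed hostname is still a placeholder, update the Ingress before exposing the service.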
Take it for a test drive:
Via CLI:
You'll need to forward the service port first:
kubectl port-forward -n web svc/tika-ui 8080
curl -d @test/url.json http://localhost:8080/ -H 'Content-Type: application/json'
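The two CLI steps above can be combined into one scripted smoke test. This is a rough sketch under stated assumptions: `smoke_test` is a hypothetical helper name, the service listens on port 8080 as in the command above, and the script is run from the repo root so that `test/url.json` resolves.

```shell
# Hypothetical helper (not part of this repo): port-forward in the
# background, POST the sample payload, then tear the tunnel down.
smoke_test() {
  ns="${1:-web}"          # namespace of the tika-ui service
  local_port="${2:-8080}" # local port; the service port is assumed to be 8080

  kubectl port-forward -n "$ns" svc/tika-ui "$local_port:8080" >/dev/null 2>&1 &
  pf_pid=$!

  # The tunnel takes a moment to come up, so retry the request a few times.
  status=1
  for attempt in 1 2 3 4 5; do
    if curl -sf -d @test/url.json -H 'Content-Type: application/json' \
         "http://localhost:$local_port/"; then
      status=0
      break
    fi
    sleep 1
  done

  # Stop the background port-forward whether or not the request succeeded.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}
```

Calling `smoke_test` prints the response body and returns a non-zero status if no attempt succeeds, which makes it usable as a quick check in CI.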
Or, via Web UI:
Using a browser, visit:
http://localhost:8080/