Pachyderm description

Pachyderm provides the data layer that lets data science and MLOps teams productionize and scale their machine learning lifecycle. With Pachyderm's industry-leading data versioning, pipelines, and lineage, teams gain data-driven automation, petabyte scalability, and end-to-end reproducibility. Teams that use Pachyderm get their machine learning projects to market faster, lower their data processing and storage costs, and can more easily meet regulatory compliance requirements.

Who uses Pachyderm?

Pachyderm is the data layer for data science and MLOps teams; it lets them scale their machine learning lifecycle with data-driven automation, petabyte scalability, and end-to-end reproducibility.

Alternatives to Pachyderm

Cority
Top-rated features
  • Document management
  • Workflow management
  • OSHA recordkeeping

SAS Viya
Top-rated features
  • Statistical analysis
  • Artificial intelligence and machine learning
  • Data visualization

Wekinator
Top-rated features
No features have been rated by users of this software.

Centralpoint
Top-rated features
No features have been rated by users of this software.

Rayven
Top-rated features
No features have been rated by users of this software.

inconnect
Top-rated features
  • Automatic call distribution
  • Reporting and analytics
  • Call tracking

SAS Customer Intelligence 360
Top-rated features
  • Campaign analytics
  • Campaign planning
  • Conversion tracking

MySQL
Top-rated features
  • Database support
  • Backup and recovery
  • Secure data storage

Squirro
Top-rated features
  • Access controls/permissions
  • Knowledge base management
  • Document management

Pachyderm reviews

Average rating

Overall rating: 4.0
Ease of use: 3.3
Customer service: 4.9
Features: 4.6
Value for money: 4.0

Reviews by rating

  • 5: 14%
  • 4: 71%
  • 3: 14%

Clayton
Lead Software Engineer (US)
Verified LinkedIn user
Hospital & Health Care, 10,000+ employees
Used the software for: more than 1 year

Rethinking Data in AI and ML

4.0, 3 years ago

Comments: Like any tool, Pachyderm is no silver bullet for the entire AI/ML stack. However, from a data processing and management perspective, it has fulfilled every application requirement I've needed it for and continues to be a flexible tool in meeting additional requirements. For example, after having computed some results from a pipeline, I needed to serve these results to an existing application. Pachyderm made this simple by exposing the data through a built-in S3 REST API. Since the application was already compatible with S3, Pachyderm served as a drop-in replacement for an S3 bucket. For anyone who strives to design clean and straightforward AI/ML architectures, I can definitely recommend Pachyderm as a must for the foundational data component.

Pros:

AI/ML production systems typically consist of multiple data processing steps organized as a DAG. Many automation frameworks manage these DAGs as tightly coupled steps ordered by _code execution_. What I like so much about Pachyderm is that it approaches DAG management as loosely coupled steps ordered by _data dependencies_. This alternative way of thinking has enabled me to design AI/ML architectures with data at the center, which has revolutionized the development and production workflows I've participated in. I can confidently store, process, and otherwise manage the data because Pachyderm provides a solid foundation for data provenance, data versioning, data storage patterns, and efficient incremental processing. Since AI/ML models are effectively a form of data, model versioning and management can be built as an extension of Pachyderm's data foundation. Furthermore, I really like that Pachyderm is powered by Kubernetes, because it passes on important architectural properties to Pachyderm, such as high scalability, robustness, efficiency, and portability (i.e. cloud agnosticism). I can containerize my pipelines, quickly test them locally through Docker Desktop or minikube, then scale them up to massive amounts of data in an on-prem or cloud cluster. If autoscaling is supported in a cloud cluster, I can especially reap the benefits of cost efficiency because I only pay for the compute resources I use.
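
The idea of ordering steps by data dependencies rather than by code execution can be sketched in a few lines. This is an illustration only, not Pachyderm's implementation: the pipeline and repo names in `PIPELINES` are hypothetical, and the sketch assumes each pipeline writes to an output repo that shares its name. Given the repo that changed, it returns the pipelines that need to run, in an order that respects their inputs.

```python
from collections import deque

# Hypothetical pipeline graph: each pipeline lists the repos it reads.
# Assumption: a pipeline writes to an output repo named after itself.
PIPELINES = {
    "clean": ["raw"],
    "features": ["clean", "metadata"],
    "train": ["features"],
    "report": ["train", "metadata"],
}


def downstream_order(pipelines, changed_repo):
    """Return the pipelines affected by a change to `changed_repo`,
    in an order that respects data dependencies."""
    # Map each repo to the pipelines that consume it.
    consumers = {}
    for name, inputs in pipelines.items():
        for repo in inputs:
            consumers.setdefault(repo, []).append(name)

    # Collect every pipeline reachable from the changed repo.
    affected = set()
    queue = deque([changed_repo])
    while queue:
        repo = queue.popleft()
        for name in consumers.get(repo, []):
            if name not in affected:
                affected.add(name)
                queue.append(name)

    # Topologically sort the affected pipelines (Kahn's algorithm),
    # counting only inputs that are themselves affected.
    pending = {
        name: sum(1 for repo in pipelines[name] if repo in affected)
        for name in affected
    }
    ready = deque(sorted(n for n, count in pending.items() if count == 0))
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for child in sorted(consumers.get(name, [])):
            if child in pending:
                pending[child] -= 1
                if pending[child] == 0:
                    ready.append(child)
    return order


print(downstream_order(PIPELINES, "metadata"))
```

In this toy graph a change to `metadata` leaves `clean` untouched, because nothing it reads has changed. The steps stay loosely coupled: adding a pipeline only requires declaring its inputs.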

Cons:

- In 1.X versions of Pachyderm, there are a few performance pain points, especially around handling very small files when uploading/downloading to/from a repo. These pain points have been significantly improved in Pachyderm 2.X.
- Also in 1.X, debugging pipeline failures can sometimes be challenging without extra tools or integrating external logging services. Pachyderm 2.X improves upon this as well.
- When Pachyderm processes data files in a pipeline, it groups the files into logical structures called datums for provenance and data efficiency reasons, and then it invokes the pipeline on each datum. This is necessary for scalability, but the downside is that each invocation of the pipeline incurs an overhead cost of just starting the processing code. The bright side is that there are several straightforward ways to engineer around the problem. It's also important to recognize that the impact of the problem is minimized by the benefits of incremental processing (i.e. only processing data that has changed on future pipeline runs).
- This isn't necessarily a problem, but prospective buyers should be aware that although compute costs may go down due to incremental processing, storage costs may go up due to storing multiple versions of data.
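
The incremental processing described above (only reprocessing datums whose data has changed) can be approximated with content hashes. The sketch below is a simplified model, not Pachyderm's actual mechanism: `datum_hash`, `run_incremental`, and the in-memory `cache` are hypothetical names introduced for illustration, and a datum is modeled as a plain mapping from file path to bytes.

```python
import hashlib


def datum_hash(files):
    """Hash a datum, given as a {path: bytes} mapping."""
    digest = hashlib.sha256()
    for path in sorted(files):
        name = path.encode()
        data = files[path]
        # Length prefixes keep path/content boundaries unambiguous.
        digest.update(len(name).to_bytes(8, "big"))
        digest.update(name)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def run_incremental(datums, process, cache):
    """Call `process` only for datums whose content changed.

    `datums` maps a datum name to its files. `cache` maps a datum name
    to a (hash, result) pair from the previous run and is updated in
    place. Returns the names of the datums that were processed.
    """
    processed = []
    for name, files in datums.items():
        key = datum_hash(files)
        previous = cache.get(name)
        if previous is not None and previous[0] == key:
            continue  # unchanged: keep the cached result
        cache[name] = (key, process(files))
        processed.append(name)
    return processed
```

The model also reflects the storage trade-off in the last point: skipping work is only possible because earlier hashes and results are kept.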

Response from the Hewlett Packard Enterprise team

3 years ago

Thank you for your very thorough review Clayton.

Cove
Data Scientist (US)
Research, 201–500 employees
Used the software for: more than 2 years

Game changer for handling dynamic data

4.0, 3 years ago

Comments: Pachyderm meets many previously unmet needs for our organization, including complete data provenance, automatic handling of data change, and modular/portable processing architecture, which facilitates the joint development of processing pipelines between software developers and scientists. Pachyderm engineers have been extremely responsive to our issues and development requests, and we plan to work well into the future with this software.

Pros:

Perhaps the most important aspect we benefit from operationally is the awareness and automatic handling of data change. Generation of our data products involves multiple processing steps and several sources of data and metadata that enter the processing sequence at various points and may change at any time. Pachyderm automatically knows what has changed and triggers downstream (re)processing, removing the need for error-prone human management.

Cons:

In Pachyderm 1.X there was a relatively high amount of overhead associated with processing each datum. Our data typically consists of small but numerous datums, and we needed to artificially combine datums for performance. However, Pachyderm has been working with us on this issue and we expect to see big improvements in 2.0 and beyond.
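
Why combining small datums helps can be seen with a simple cost model. The sketch below is illustrative only: it assumes each invocation pays a fixed startup cost, and the figures in the usage note (10,000 items, 2 s startup, 0.5 s per item) are hypothetical, not measurements of Pachyderm.

```python
import math


def batch(items, size):
    """Group `items` into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def estimated_seconds(n_items, batch_size, startup_s, per_item_s):
    """Estimate total run time when each batch pays one startup cost."""
    invocations = math.ceil(n_items / batch_size)
    return invocations * startup_s + n_items * per_item_s


# One item per invocation versus 100 items per invocation.
print(estimated_seconds(10_000, 1, 2, 0.5))
print(estimated_seconds(10_000, 100, 2, 0.5))
```

Under these assumed numbers, the estimate drops from 25,000 s to 5,200 s. The trade-off is coarser incremental processing: in this model a change to one item forces its whole batch to be reprocessed.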

Martin
Sr. Data Scientist (Germany)
Biotechnology, 51–200 employees
Used the software for: 6 to 12 months

Great in theory

3.0, 3 years ago

Comments: We achieved some of our goals with Pachyderm. However, we were really hoping to spend more time on solving the problems directly related to our goal. Instead, we spent a significant amount of time solving problems with Pachyderm and tailoring our problem to it.

Pros:

Great concept, really fits what we would like to do. Re-computing only the pieces where the data has changed is super valuable.

Cons:

Working with it in practice is very hard. We would also like to use Pachyderm for research, developing research pipelines that can be executed easily on large amounts of data on the cluster. However, during research/development, pipelines naturally crash often. Translating something that works locally to something that works in Pachyderm can fail in several ways. Inspecting those types of errors is incredibly difficult, unless you invest a significant amount of time into setting up logging/monitoring manually.

Response from the Hewlett Packard Enterprise team

3 years ago

Hello Martin, thank you for your feedback, we truly appreciated it. Pachyderm 2 will have several enhancements around the troubleshooting workflow for pipelines and the new Console (dashboard) will likely be of great help here. However, we're striving to further improve the user experience of Pachyderm with every release. Thank you.

Xubo
Staff Data Engineer (US)
Biotechnology, 201–500 employees
Used the software for: more than 1 year

Pachyderm is a great data processing platform on cloud.

4.0, 3 years ago

Comments: We have used Pachyderm for more than a year. The overall experience is good. We love the core technology and features provided by Pachyderm. We experienced frustrating issues with download speed, deployment, and system stability. We get excellent support from the Pachyderm team all the time.

Pros:

Data-driven automation. It supports incremental data processing. Reproducibility. Perfectly matches our tech stack: K8s, S3. Community facing.

Cons:

We expect fully automated data replication/export to an external storage system. The logging and debugging support could be improved.

Response from the Hewlett Packard Enterprise team

3 years ago

Xubo, Thank you for your review, we greatly appreciate your feedback. We'll make sure to pass your feedback around logging and debugging on to our product team. - Pachyderm

Chris
Director of Engineering and Data Science (US)
Marketing and Advertising, 2–10 employees
Used the software for: 6 to 12 months

Scalable machine learning without the mlops

5.0, 3 years ago

Pros:

The ability to scale model builds in native Python is something that has been missing in this space until now. Using Spark and/or Dask comes with a large amount of overhead that can be avoided by leveraging Pachyderm.

Cons:

The learning curve is quite steep, since there are some foundational core concepts to understand before using Pachyderm.

Response from the Hewlett Packard Enterprise team

3 years ago

Thank you for your review Chris!

Chris
Lead Developer (Canada)
Information Technology and Services, 2–10 employees
Used the software for: more than 1 year

Pachyderm for data pipelines

4.0, 3 years ago

Pros:

Pachyderm pipelines are an intuitive way to split and process data concurrently using autoscaling compute clusters. Writing a program to interact with data in a pipeline is straightforward due to working similar to a native filesystem, requiring no additional libraries or integrations.
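
As an illustration of the filesystem-style interaction described above, pipeline code can be written with nothing but ordinary file I/O. The sketch assumes the usual Pachyderm layout, in which an input repo is mounted under `/pfs/<repo>` and results are written to `/pfs/out`. The repo name `logs` and the function `count_lines` are hypothetical.

```python
import os


def count_lines(input_dir, output_dir):
    """Write a line count for every file found under `input_dir`."""
    for root, _dirs, files in os.walk(input_dir):
        for name in files:
            source = os.path.join(root, name)
            relative = os.path.relpath(source, input_dir)
            target = os.path.join(output_dir, relative + ".count")
            # Mirror the input directory structure in the output.
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(source, "rb") as handle:
                lines = sum(1 for _ in handle)
            with open(target, "w") as out:
                out.write(f"{lines}\n")


if __name__ == "__main__":
    # Assumed layout: input repo "logs" at /pfs/logs, results in /pfs/out.
    count_lines("/pfs/logs", "/pfs/out")
```

Because the directories are plain parameters, the same function can be run against a local folder before it is packaged into a container.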

Cons:

We ran into issues with Pachyderm that required deleting and recreating pipelines. As an upside, support was very responsive to resolving our problems and providing upgrades to Pachyderm.

Response from the Hewlett Packard Enterprise team

3 years ago

Chris, Thank you for your great feedback. We're glad to hear that our support team has been a great asset to you. We'll make sure to pass along the feedback.

Will
Principal Engineer (UK)
Information Technology and Services, 51–200 employees
Used the software for: 6 to 12 months

The missing ingredient for reproducible research

4.0, 3 years ago

Comments: I'm a big fan of the Pachyderm approach; it's young software and needs to be understood a little to get the best out of it; but when stuff works, it works so damn well.

Pros:

The systematic recording of provenance for training and benchmarking results.

Cons:

When things go wrong, it's hard to diagnose.

Response from the Hewlett Packard Enterprise team

3 years ago

Thank you for the review, Will.