{"id":5638,"date":"2021-07-27T09:45:50","date_gmt":"2021-07-27T07:45:50","guid":{"rendered":"https:\/\/spicasolutions.pl\/?post_type=portfolio&#038;p=5638"},"modified":"2021-07-27T10:49:46","modified_gmt":"2021-07-27T08:49:46","slug":"monitoring-apache-spark-with-hadoop","status":"publish","type":"portfolio","link":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/","title":{"rendered":"Monitoring Apache Spark with Hadoop environments"},"content":{"rendered":"<div id=\"pl-5638\"  class=\"panel-layout\" ><div id=\"pg-5638-0\"  class=\"panel-grid panel-no-style\" ><div id=\"pgc-5638-0-0\"  class=\"panel-grid-cell\" ><div id=\"panel-5638-0-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"0\" ><div\n\t\t\t\n\t\t\tclass=\"so-widget-sow-editor so-widget-sow-editor-base\"\n\t\t\t\n\t\t>\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<p><img loading=\"lazy\" decoding=\"async\" class=\"alignleft size-full wp-image-1963\" src=\"https:\/\/spicasolutions.pl\/wp-content\/uploads\/2019\/05\/monit.png\" alt=\"monitoring apache\" width=\"100\" height=\"100\" \/><\/p>\n<p>Apache Spark technologies along with Apache Hadoop are commonly used for Big Data-level processing. At the same time, it is difficult to properly monitor processing, especially in distributed environments, when a single application is executed on multiple worker nodes.<\/p>\n<p>&nbsp;<\/p>\n<p>We are the authors of an extension that collects the most important metrics provided by the Apache Spark API, allowing them to be properly correlated with measurements taken by APM tools running on the hosts. This gives us a live view of which application and executor ran on a given worker. We know how many resources have been allocated and whether a stage or job has ended in an error. 
With a complete set of information, we can easily tell which workloads our environment is struggling with, and therefore where we should start optimizing.<\/p>\n<\/div>\n<\/div><\/div><\/div><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>Apache Spark technologies along with Apache Hadoop are commonly used for Big Data-level processing. At the same time, it is difficult to properly monitor processing, especially in distributed environments, when a single application is executed on multiple worker nodes. &nbsp; We are the authors of an extension that collects the most important metrics provided by &hellip;<\/p>\n","protected":false},"author":8,"featured_media":1964,"template":"","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"portfolio_category":[],"portfolio_tag":[],"class_list":["post-5638","portfolio","type-portfolio","status-publish","has-post-thumbnail","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Monitoring Apache Spark with Hadoop environments &#187; Spica Solutions<\/title>\n<meta name=\"description\" content=\"Apache Spark technologies along with Apache Hadoop are commonly used for Big Data-level processing. 
At the same time, it is difficult to properly monitor\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Monitoring Apache Spark with Hadoop environments &#187; Spica Solutions\" \/>\n<meta property=\"og:description\" content=\"Apache Spark technologies along with Apache Hadoop are commonly used for Big Data-level processing. At the same time, it is difficult to properly monitor\" \/>\n<meta property=\"og:url\" content=\"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/\" \/>\n<meta property=\"og:site_name\" content=\"Spica Solutions\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/profile.php?id=100040816125174&amp;amp%3bref=embed_page\" \/>\n<meta property=\"article:modified_time\" content=\"2021-07-27T08:49:46+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/spicasolutions.pl\/wp-content\/uploads\/2019\/05\/monit.png\" \/>\n\t<meta property=\"og:image:width\" content=\"100\" \/>\n\t<meta property=\"og:image:height\" content=\"100\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/monitoring-apache-spark-with-hadoop\\\/\",\"url\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/monitoring-apache-spark-with-hadoop\\\/\",\"name\":\"Monitoring Apache Spark with Hadoop environments &#187; Spica Solutions\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/monitoring-apache-spark-with-hadoop\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/monitoring-apache-spark-with-hadoop\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/spicasolutions.pl\\\/wp-content\\\/uploads\\\/2019\\\/05\\\/monit.png\",\"datePublished\":\"2021-07-27T07:45:50+00:00\",\"dateModified\":\"2021-07-27T08:49:46+00:00\",\"description\":\"Apache Spark technologies along with Apache Hadoop are commonly used for Big Data-level processing. 
At the same time, it is difficult to properly monitor\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/monitoring-apache-spark-with-hadoop\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/monitoring-apache-spark-with-hadoop\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/monitoring-apache-spark-with-hadoop\\\/#primaryimage\",\"url\":\"https:\\\/\\\/spicasolutions.pl\\\/wp-content\\\/uploads\\\/2019\\\/05\\\/monit.png\",\"contentUrl\":\"https:\\\/\\\/spicasolutions.pl\\\/wp-content\\\/uploads\\\/2019\\\/05\\\/monit.png\",\"width\":100,\"height\":100},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/monitoring-apache-spark-with-hadoop\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Oferta\\\/klienci Spica\",\"item\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/portfolio\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Monitoring Apache Spark with Hadoop environments\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/#website\",\"url\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/\",\"name\":\"Spica Solutions\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/spicasolutions.pl\\\/en\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Monitoring Apache Spark with Hadoop environments &#187; Spica Solutions","description":"Apache Spark technologies along with Apache Hadoop are commonly used for Big Data-level processing. 
At the same time, it is difficult to properly monitor","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/","og_locale":"en_US","og_type":"article","og_title":"Monitoring Apache Spark with Hadoop environments &#187; Spica Solutions","og_description":"Apache Spark technologies along with Apache Hadoop are commonly used for Big Data-level processing. At the same time, it is difficult to properly monitor","og_url":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/","og_site_name":"Spica Solutions","article_publisher":"https:\/\/www.facebook.com\/profile.php?id=100040816125174&amp%3bref=embed_page","article_modified_time":"2021-07-27T08:49:46+00:00","og_image":[{"width":100,"height":100,"url":"https:\/\/spicasolutions.pl\/wp-content\/uploads\/2019\/05\/monit.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/","url":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/","name":"Monitoring Apache Spark with Hadoop environments &#187; Spica Solutions","isPartOf":{"@id":"https:\/\/spicasolutions.pl\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/#primaryimage"},"image":{"@id":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/#primaryimage"},"thumbnailUrl":"https:\/\/spicasolutions.pl\/wp-content\/uploads\/2019\/05\/monit.png","datePublished":"2021-07-27T07:45:50+00:00","dateModified":"2021-07-27T08:49:46+00:00","description":"Apache Spark technologies along with Apache Hadoop are commonly used for Big Data-level processing. 
At the same time, it is difficult to properly monitor","breadcrumb":{"@id":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/#primaryimage","url":"https:\/\/spicasolutions.pl\/wp-content\/uploads\/2019\/05\/monit.png","contentUrl":"https:\/\/spicasolutions.pl\/wp-content\/uploads\/2019\/05\/monit.png","width":100,"height":100},{"@type":"BreadcrumbList","@id":"https:\/\/spicasolutions.pl\/en\/monitoring-apache-spark-with-hadoop\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Oferta\/klienci Spica","item":"https:\/\/spicasolutions.pl\/en\/portfolio\/"},{"@type":"ListItem","position":2,"name":"Monitoring Apache Spark with Hadoop environments"}]},{"@type":"WebSite","@id":"https:\/\/spicasolutions.pl\/en\/#website","url":"https:\/\/spicasolutions.pl\/en\/","name":"Spica 
Solutions","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/spicasolutions.pl\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/portfolio\/5638","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/portfolio"}],"about":[{"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/types\/portfolio"}],"author":[{"embeddable":true,"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/users\/8"}],"version-history":[{"count":5,"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/portfolio\/5638\/revisions"}],"predecessor-version":[{"id":5647,"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/portfolio\/5638\/revisions\/5647"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/media\/1964"}],"wp:attachment":[{"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/media?parent=5638"}],"wp:term":[{"taxonomy":"portfolio_category","embeddable":true,"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/portfolio_category?post=5638"},{"taxonomy":"portfolio_tag","embeddable":true,"href":"https:\/\/spicasolutions.pl\/en\/wp-json\/wp\/v2\/portfolio_tag?post=5638"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}