DATA SCIENCE PARTNERSHIP
+44 (0) 208 133 0822
SMACK Stack – A Modern Tool Set for Data Science
Date: May 8, 2017

What Is SMACK?

Combining powerful data science tools is nothing new. However, while most data science tool sets focus on delivering the key aspects of data analytics for big data scenarios, some go beyond that. One tool set that handles not just big data but also event processing, and does so very quickly, is the SMACK stack. Its name stands for the tools it comprises: Spark (the main processing engine), Mesos (the cluster resource manager that hosts the whole ecosystem), Akka (the actor-based application model), Cassandra (the system handling storage and retrieval of data), and Kafka (the message broker).

Usefulness of the SMACK Stack

SMACK combines many features that make it very useful for many niche data science tasks. For example, Spark and Akka enable you to build data analysis pipelines that handle both large data files and event processing. What’s more, they can be tuned to meet your latency restrictions and deliver throughput within the desired specifications. As for the coordination and administration of the various tasks, Mesos has you covered, although other scheduling systems such as YARN could also be used. When it comes to the persistence and distribution of events, you can rely on Cassandra, while Kafka takes care of anything related to event transport.
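The division of labour described above can be sketched in a few lines of plain Python. This is only a toy, in-memory illustration under stated assumptions: the deque stands in for a Kafka topic, the function `process_batch` for a Spark job, and the dictionary `table` for a Cassandra table. None of the names or the sample events come from the actual tools, and no real Kafka, Spark, or Cassandra API is used.

```python
from collections import defaultdict, deque

# Stand-in for a Kafka topic: an in-memory FIFO of events (hypothetical data).
topic = deque([
    {"page": "/home", "ms": 120},
    {"page": "/cart", "ms": 340},
    {"page": "/home", "ms": 80},
])

# Stand-in for a Cassandra table, keyed by page.
table = {}


def process_batch(events):
    """Stand-in for a Spark job: count hits and average latency per page."""
    totals = defaultdict(lambda: [0, 0])  # page -> [hit count, total ms]
    for event in events:
        totals[event["page"]][0] += 1
        totals[event["page"]][1] += event["ms"]
    return {
        page: {"hits": hits, "mean_ms": total_ms / hits}
        for page, (hits, total_ms) in totals.items()
    }


def run_pipeline():
    """Drain the broker, process the batch, and persist the aggregates."""
    batch = []
    while topic:
        batch.append(topic.popleft())
    table.update(process_batch(batch))
    return table


result = run_pipeline()
print(result)
```

In a real deployment each stand-in would be a separate distributed service, with Mesos (or YARN) allocating the resources they run on.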

Although SMACK does not support newer technologies like Julia, which may well become the norm in data science in the years to come, it does handle established languages such as Scala, Java, Python and R. Moreover, its processing engine (Spark) is significantly faster than Hadoop MapReduce for many workloads. What’s more, the whole stack is open-source, so you don’t have to worry about licensing and other fees, making the whole framework very cost-effective (your main cost would be hiring and/or training the data scientists who will make use of it).

In addition, SMACK is adept at handling stream data, which is quite common, particularly for companies that make use of dynamic data, such as that found in web logs. Big data also tends to be diverse (one of the Vs of big data stands for Variety), something that SMACK can handle as well, thanks to its architecture. So, regardless of your big data problem, SMACK is bound to be able to help you solve it.

Conclusions and Next Steps

SMACK is very popular today not because of some innovative new tech but because it combines the features of various technologies, enabling it to extract a lot of value from the big data you have access to. Yet, it cannot do everything by itself, no matter how robust each of its components is. Just like any other big data framework, SMACK requires knowledgeable and competent data science professionals in order to make the most of it. Data Science Partnership is in a position to provide you access to such professionals, so that you too can get the most out of this powerful tool set. Feel free to reach out to us for more information.

Boosting Cybersecurity with the Use of A.I.
Date: May 3, 2017

Introduction

Cybersecurity, the set of frameworks geared towards protecting computers and computer networks from malicious software and unauthorized access, is a major concern across almost all industry sectors. After the recent hacks of various sites’ servers thought to be “secure”, people have started to realize that the concept of security in the computer world is quite relative. Despite tremendous efforts on the part of security experts, malicious hackers (aka crackers) appear to catch up to the newest security measures fairly quickly. Moreover, the fact that they have access to more and more computing power only makes things easier for the intruders. Could it be that Artificial Intelligence (A.I.) holds the key to keeping these cybercriminals at bay?

Challenges of Modern Cybersecurity

Before we tackle this question, let’s look at the problem of cybersecurity more closely and attempt to identify the issues we need to address to have a secure computer ecosystem. First, the various attacks on computer systems and networks are quite diverse and relatively unique, making them very hard to pinpoint accurately. What’s worse, they can mutate rapidly, making them tough to handle again once identified. And most importantly, the majority of modern systems are very complex, and the threats they face are very hard to track with a conventional rule-based cybersecurity system. All these issues make cybersecurity a very challenging task, while the stakes of delivering protection against malicious hackers have never been higher.

Intelligence to the Rescue

Even if all these issues appear daunting, there is reason for hope in cybersecurity, thanks to Artificial Intelligence and sophisticated systems based on it, such as those employing IBM’s Watson. Some of the ways Artificial Intelligence benefits cybersecurity are:

  • Faster and more effective remediation. Basically, this involves immediate notification about potential security breaches and steps to intervene, when possible. This way whatever damage the breach causes is kept at a bare minimum.
  • Better risk level evaluation. Determining the actual risks involved when a system’s security is compromised is very hard, especially in sophisticated networks. Artificial Intelligence can tackle this challenge efficiently and thoroughly, oftentimes without continuous human supervision.
  • More accurate threat identification. The sheer number of potential threats makes it infeasible to identify them accurately and in a timely manner by conventional means. Artificial Intelligence can help with this in a way that was unimaginable before. This is particularly useful for large organizations, which usually attract malicious hackers.
  • More effective use of available data. This enables A.I.-based cybersecurity systems to learn and evolve constantly. With data becoming increasingly abundant, particularly with the Internet of Things movement, Artificial Intelligence approaches like deep learning, which require a large amount of data to work effectively, are becoming better and more practical, making the available data more useful in fending off cybersecurity threats.

The best part is that Artificial Intelligence technologies are evolving too, so A.I. is bound to get even better at all of these, making it a more robust defense against cyber-attacks.
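To illustrate how a data-driven detector differs from a fixed rule, here is a minimal sketch. It uses a simple z-score baseline, which is a statistical stand-in and not the kind of A.I. system discussed above. The login counts, the `STATIC_LIMIT` rule, and the threshold of 3 standard deviations are all hypothetical values chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical history: failed logins per hour for one account.
baseline = [2, 3, 1, 4, 2, 3, 2, 3]

# A conventional fixed rule: alert only above a hard-coded limit.
STATIC_LIMIT = 50


def fit(history):
    """Learn a per-account baseline (mean and sample standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, model, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = model
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


model = fit(baseline)
observed = 12  # a burst that is unusual for this account

print("static rule fires:", observed > STATIC_LIMIT)
print("learned baseline fires:", is_anomalous(observed, model))
```

The fixed limit misses the burst because it was never tuned to this account, while the baseline learned from the account's own history flags it. Production systems would use far richer features and models, but the contrast is the same.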

The Flipside of A.I. in All This

Things are not all rosy, however. The misuse of Artificial Intelligence in black-hat hacking is evident, not to mention dangerous. For example, Artificial Intelligence based systems have been used to improve network attacks, as well as in phishing, where ransomware collects data in order to emulate a user’s writing style and trick their contacts into clicking on a suspicious attachment sent by the hacker. Artificial Intelligence is also used maliciously in vulnerability searching, an otherwise slow and difficult process that enables black-hat hackers to exploit a computer system.

Summing Up

Not everything is doom and gloom, however. As more and more effort goes into Artificial Intelligence research, through initiatives like Elon Musk’s OpenAI, and with more people gaining awareness of A.I. tech, it is possible to thwart cyber intrusions and protect your data and software. This is not possible with an off-the-shelf cybersecurity system, though. To ensure effective use of Artificial Intelligence in cybersecurity, it is best to have an expert on board, even as a consultant. Data Science Partnership offers access to such experts, so that you too can benefit from the great things Artificial Intelligence has to offer in this ever-significant domain. Please feel free to contact us for a free one-hour initial consultation with Data Science Partnership.

Applications of Deep Learning to Real-World Analytics Scenarios
Date: April 4, 2017

What Is Deep Learning?

Many people talk about Deep Learning these days but few actually know what they are talking about. Fortunately, there is a plethora of articles and books available on the topic. Yet, you don’t need to go anywhere else to find out more about this fascinating part of data science, since Data Science Partnership is one of the places where this technology flourishes. In essence, Deep Learning is a family of machine learning methods that employ Artificial Intelligence (AI) in the form of sophisticated, many-layered Artificial Neural Networks (ANNs) to tackle complex problems. There are a couple of catches, though. First of all, in order to do something useful with Deep Learning you need to have a lot of data. Also, in order to make the most of this data using Deep Learning, you need people who know the ins and outs of Deep Learning, since getting a large ANN to do its magic is quite a complicated process.

Why Is Deep Learning Relevant?

Unlike many other technologies that seem to appear whimsically in the data science field without taking root, Deep Learning is here to stay, as it provides substantial benefits to the organizations that utilize it. The reason is simple. Deep Learning manages to accelerate the whole process of insight discovery known as the data science pipeline. That’s not to say that it fully automates the whole process, though. Contrary to what many AI evangelists claim, all AI technologies are heavily dependent on specialized experts who have a solid grasp of data science and AI, as well as decent business acumen. Machines have grown more robust in the past years, but the idea of them becoming autonomous is still in the realm of science fiction.

Moreover, given enough data, Deep Learning systems often outperform conventional machine learning systems, as they manage to generalize better on the problem they are tackling. Part of the reason is that they employ an entirely data-driven approach, learning their own features rather than relying on hand-crafted assumptions, which makes them versatile (something hard to achieve with statistical data analysis systems). The improvement in performance is also quite noticeable, making Deep Learning a viable option for many data analytics projects.

How Does Deep Learning Apply in the Real World?

So, how does all this benefit you and your organization? Well, it can offer you the means to obtain better performance in your data science projects if you are involved in one or more of the following domains:

  • Data mining
  • Signal processing and analysis
  • Image recognition
  • Sound analytics
  • Other domains of high complexity

Naturally, the industries that have the most to gain from this technology are:

  • Finance
  • Retail
  • Telecommunications & IT
  • Aerospace & Defense
  • Media and Advertising
  • Medical
  • Automotive
  • Industrial
  • Oil, Gas, and Energy

Also, although the majority of organizations employing Deep Learning are in North America, there is a lot of interest in other parts of the world, particularly Europe (mainly the U.K., France, and Germany), Asia (mainly China, South Korea, India, and Japan), and other regions (e.g. the Middle East and Latin America).

The market that Deep Learning taps into is expected to be worth around 1.7 billion USD within the next 5 years, which is significantly more than it is worth today. Therefore, Deep Learning is much more than a fad, and its level of adoption in the industry and its integration in the data science field are only going to grow from now on.

Next Steps?

There are several options to harness the power of Deep Learning for your data analytics projects. For starters, you can get in touch with Data Science Partnership to hire an expert in this field or, if you have a team in place already and merely wish to upgrade its know-how, to learn how you can apply Deep Learning to the data available to you through a few training sessions. The latter can take place at your venue or online, depending on your requirements and schedule. Whatever you decide to do, we are happy to help you make the most of this promising and robust technology.

Zacharias Voulgaris Joins Data Science Partnership as Head of Content
Date: March 15, 2017

Getting On-board @ Data Science Partnership

I’d like to take this opportunity to announce that from now on I’ll be part of the Data Science Partnership team, as the Head of Content. Although the company focuses more on hands-on projects and on-site training for enterprises, we feel that it would be a good idea to share certain materials with everyone who takes the time to visit our site. After all, technology-related information is everyone’s domain, and unlike academics who like to hide all their knowledge behind research papers inaccessible to the everyday person, we at Data Science Partnership prefer to share non-sensitive information on this tech freely with the world and make it comprehensible to the majority. This not only allows for fewer misunderstandings about the ever-changing fields of Data Science / Machine Learning / Artificial Intelligence, but can also help inspire others to get involved in these technologies and benefit the world through them.

Through this role, I aim to contribute to all that and help promote these fascinating technologies, along with news about them. I’m hoping to constantly refine my methods through your feedback, so if you have any comments or questions about any of the Data Science / Machine Learning / Artificial Intelligence topics I’ll be covering, please share them either via the comment section below, or via our ‘contact us’ page.

Throughout my life I’ve been fascinated with data analytics and artificial intelligence. Even during my undergraduate degree, I spent a whole semester of my capstone project (aka thesis) developing models based on financial data I dug up from some lengthy volumes. Also, during my PhD, I worked with machine learning and A.I. based heuristics in order to improve classification methodology. Afterwards, during my post-doc, I worked with predictive analytics models using sensor data for a couple of military projects undertaken by the lab I worked at. In industry I have worked with all kinds of data, mainly financial, to build predictive models for various use cases. Also, over the past half-decade or so, I’ve authored a couple of data science books and several videos on various data science related topics. Finally, parallel to this blog, I have my own personal blog, which I use to share the latest updates on my data science educational material and posts on various topics that interest me.

I look forward to sharing my enthusiasm about this field through various articles on recent developments and other relevant topics. So, bookmark this page if you haven’t already, and stay connected!
