“Data Lake” is one of the current hot topics in Big Data with lots of press and little practical content. A big part of the confusion and controversy arises from the tension between hype and common sense. Proponents promise phenomenal insights and actionable predictions about customer behavior, employee behavior, doctors’ handwritten diagnostic notes and so on. Basically, they claim you dump documents and data into the lake and let the algorithms make sense of it all. Of course, that’s patent nonsense. Unfortunately, the hype obscures some real value in the data lake concept. In this and following posts we’ll see … Read More
Letting users toggle between dimensions or measures in Tableau is a nice feature to have, whether you are building a story or just a dashboard. Setting this up is easy and takes only a couple of steps. The first step is to choose “Create Parameter…” in Tableau by right-clicking on the left side of the program: To create the parameter here you have to change the data type to String and the allowable values to List. Once you’re done with that you can type in … Read More
Graphics in both Tableau and Shiny have their pluses and minuses. Tableau is good for drag-and-drop graphics: it is easy to change the type of graph to suit the situation with little to no coding. R/Shiny graphics are completely customizable, from 3D graphics to violin plots, which combine a boxplot with a probability density estimate. Also, with Shiny you can get summaries of the data and apply many other advanced statistical methods; beware of running analyses in Shiny without checking the proper diagnostic tests for the data at hand (this is my statistics background speaking). Where Tableau … Read More
Developing in RStudio Server feels the same as developing in its desktop counterpart; the only difference is that you access it through your web browser. In fact, RStudio Server has some advantages: all you need is a computer that can connect to the internet, and computations use the server’s resources rather than your local machine’s. This means that you can develop R code on a Chromebook or tablet. I even connected to my RStudio server on my iPhone and could code if I wanted, but quickly found that it would be more of a hassle than it’s … Read More
RStudio provides a free version of their Shiny Server for all to use. One of the drawbacks of the free version is that it is public facing and has no login. If you want to use Shiny Server to host sensitive data, you will likely want to purchase the license to help protect your data. Getting a Shiny app hosted is easy: all you need to do is move your Shiny files to the following directory: /srv/shiny-server/ Once they are there, you can share your app with anyone using the following URL: http://<hostname>:3838/APP_NAME/.
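The deployment step above can be sketched as a small helper script. This is only an illustration, not part of Shiny Server itself: the function names `deploy_app` and `app_url` are made up for this example, and the script assumes the default `/srv/shiny-server` directory and port 3838 mentioned above, plus write permission on that directory.

```python
import shutil
from pathlib import Path

# Directory and port taken from the post; adjust if your server is configured differently.
SHINY_ROOT = Path("/srv/shiny-server")
SHINY_PORT = 3838


def deploy_app(app_dir, server_root=SHINY_ROOT):
    """Copy a local Shiny app folder into the server directory (hypothetical helper).

    The folder name becomes APP_NAME in the URL. Returns the destination path.
    """
    src = Path(app_dir)
    dest = Path(server_root) / src.name
    # dirs_exist_ok=True lets a re-deploy overwrite an earlier copy (Python 3.8+).
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest


def app_url(hostname, app_name, port=SHINY_PORT):
    """Build the URL pattern from the post: http://<hostname>:3838/APP_NAME/."""
    return f"http://{hostname}:{port}/{app_name}/"
```

For example, `deploy_app("my_dashboard")` followed by `app_url("example.com", "my_dashboard")` would copy the folder and give you the link to share; both names are placeholders.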
Making Wind Energy Cost Competitive, sponsored by the Independent Oracle Users Group. This is a case study of the Fluitec Wind system, which M&S Director of Data Science Bob Liekar led and will present on Wednesday, May 13, 12:45pm–2:00pm EDT. About the Webinar: The Fluitec Wind Tribo-Analytics system is a production application that utilizes multiple Big Data disciplines. This case study presents a real-world application of Big Data and IoT (Internet of Things). The techniques and algorithms can be applied to many other industries. From Fluitec: “Wind turbines are expensive to operate. They are remote and distributed with highly … Read More
My philosophy of visualizing data is to keep it simple: you want the audience to look at a graph and know within a few moments what it is showing. The key is that the graph can be read both accurately and quickly; we need to be careful not to distort the data, because we don’t want the audience to misinterpret the graph at a quick glance. An easy way of distorting the data is changing the range of the axis, which can lead the audience to believe there are more extreme changes in the data when there actually aren’t and the data is stable overall. … Read More
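To make the axis-range point concrete, here is a minimal sketch that computes how much taller one bar looks than another for a given axis starting value. The function name `apparent_ratio` and the sample numbers are made up for illustration; they do not come from any real dataset.

```python
def apparent_ratio(small, large, axis_min=0.0):
    """Ratio of drawn bar heights when the value axis starts at axis_min.

    With axis_min=0 the drawn ratio equals the true ratio of the values.
    Raising axis_min shrinks both bars by the same amount, which inflates
    the visual difference between them.
    """
    if axis_min >= small:
        raise ValueError("axis_min must be below the smaller value")
    return (large - axis_min) / (small - axis_min)


# Hypothetical values: two measurements that differ by about 2%.
true_ratio = apparent_ratio(98, 100)                     # axis starts at 0
truncated_ratio = apparent_ratio(98, 100, axis_min=95)   # axis starts at 95

print(f"axis from 0:  larger bar looks {true_ratio:.2f}x as tall")
print(f"axis from 95: larger bar looks {truncated_ratio:.2f}x as tall")
```

With the axis starting at 95, the same two values are drawn as bars of height 3 and 5, so a roughly 2% difference looks like a much larger jump.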