This Week in Scalability: Why Your Work May No Longer Be (Statistically) Significant

“Distributed systems are never ‘up’; they exist in a constant state of partially degraded service. Accept failure, design for resiliency, protect and shrink the critical path.” — Charity Majors.
At first, we were a little worried about posting our science writer Kimberly Mok’s remarkable story about an effort to use artificial intelligence techniques and tools to reconstruct forgotten memories. Mok takes us through the thinking that could actually make such a thing possible, albeit only remotely so. The scientists figure that an MRI could scan people’s reactions to seeing photos of folks they know, recording subtle changes in their “cerebral circulation.” The resulting data set could then be used as a model to create entirely new faces/images, which, in turn, could be shown back to the participants to see if any of them spark similar peaks of “cerebral circulation” (love that term).
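Purely as an illustration of that loop, and emphatically not the scientists’ actual method, here is a toy Python sketch in which “images” and “brain responses” are plain vectors and the scanner is a noisy linear map. Every miracle step is swapped for the simplest stand-in that still shows the shape of the workflow: learn a response model from scans, generate new candidates, and rank them by how well their predicted responses match the remembered face’s.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: "images" and "brain responses" are plain vectors, and the
# scanner is an unknown noisy linear map we only observe through data.
n_train, img_dim, resp_dim = 200, 32, 16
true_map = rng.normal(size=(img_dim, resp_dim))
train_images = rng.normal(size=(n_train, img_dim))
train_responses = train_images @ true_map + 0.1 * rng.normal(size=(n_train, resp_dim))

# Step 1: from scan data, learn a model of response-given-image (least squares).
learned_map, *_ = np.linalg.lstsq(train_images, train_responses, rcond=None)

# Step 2: generate entirely new candidate "faces."
candidates = rng.normal(size=(50, img_dim))

# Step 3: a remembered face evokes a target response; rank candidates by how
# closely their predicted responses match that target.
remembered = rng.normal(size=img_dim)
target = remembered @ true_map
errors = np.linalg.norm(candidates @ learned_map - target, axis=1)
best = candidates[np.argmin(errors)]
cos = best @ remembered / (np.linalg.norm(best) * np.linalg.norm(remembered))
print(f"best candidate vs. remembered face, cosine similarity: {cos:.2f}")
```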
As you can see, there are a lot of “<insert miracle here>” holes in this particular workflow. And we were wary because, this week, an alert reader, “kpolo,” had left a comment on another one of our AI stories admonishing us to “Stop labeling things they are not,” meaning we should stop calling software intelligent when it clearly is not.
It is true that we are entering a pretty serious hype cycle, with every vendor grafting the abbreviation “AI” onto its marketing decks to describe products whose connection to intelligence, artificial or otherwise, can be tenuous. “It’s tiring to see everything labeled as ‘AI,’” kpolo wrote. “Nothing so far resembles intelligence. This is all just statistical data manipulation.”
Indeed, AI relies heavily on statistical analysis, but at the same time, statistics describe phenomena that mere binary evaluators might miss, and that alone is worthy of exploration. If it takes a buzzword to get some government or VC funding to explore this space, so be it: This is not AI’s first ride on the hype cycle. And something like memory reconstruction is indeed a “Hail Mary” of AI moves, but that’s why we are so intrigued. It could lead to some interesting results, intended or otherwise.
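To make both kpolo’s point and ours concrete, here is a toy of our own devising (it comes from none of the stories above): two black boxes answer a yes/no question correctly 52 percent and 48 percent of the time. Any single binary answer from either box tells you nothing, yet unglamorous “statistical data manipulation” over many trials separates them cleanly.

```python
import random

random.seed(1)

# Two black boxes that answer a yes/no question correctly 52% and 48% of
# the time. One binary answer from either box is uninformative; plain
# statistics over many trials tells them apart.
def correct(p: float) -> bool:
    return random.random() < p

trials = 10_000
score_a = sum(correct(0.52) for _ in range(trials)) / trials
score_b = sum(correct(0.48) for _ in range(trials)) / trials

# The standard error of a proportion at n=10,000 is about 0.005, so a
# four-point gap sits several standard errors wide: invisible in one
# trial, unmistakable in aggregate.
print(f"box A: {score_a:.3f}, box B: {score_b:.3f}")
```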
“Data, report” pic.twitter.com/AobgrnYf9P — Swear Trek at Ottawa Comiccon (@swear_trek), July 28, 2017
This Week in Links:
- University at Buffalo’s Murat Demirbas compared distributed machine learning platforms and found that Spark is more CPU-intensive than TensorFlow and the Parallel Machine Learning System (PMLS), though “as far as bottlenecks is concerned, network still remains as a bottleneck for distributed [machine learning] applications.”
- Speaking of bottlenecks, people bitch that Slack is a hella resource hog, especially when running multiple channels.
- Netflix offers the world Vectorflow, a lightweight neural network library.
- Cargo Cult Data Science: Organizations should stop chasing technology and start working with experienced techs to solve organizational problems, argues Skyfii lead data consultant Richard Weiss.
- Here’s your weekend project: “Getting Started With Raspberry Pi 1, Zero, or Zero W and Node.js.”
This Week in Podcasts:
The American Association for the Advancement of Science’s “Science Podcast” discusses a new way to make sense of large data sets: “sonifying” them, or turning them into songs. Pennsylvania State University researcher Mark Ballora is doing this work. It could provide big data researchers “a different way of attacking the data,” said Science editor David Grimm. Bonus segment: Evaluations are underway for sharpening the threshold for statistical significance, the “p-value,” from 0.05 to 0.005.
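Sonification is easier to grasp with a toy in hand. The sketch below is our own minimal illustration, not Ballora’s technique: it maps a numeric series onto pitches (higher values, higher notes) and writes a playable WAV file using nothing but Python’s standard library.

```python
import math
import struct
import wave

# Map each data point to a note: scale the value into a frequency band
# (220 Hz to 880 Hz here) and synthesize a short sine tone for it.
data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]  # any numeric series you want to "hear"
rate, note_seconds = 44100, 0.25
lo, hi = min(data), max(data)

samples = []
for x in data:
    freq = 220 + (x - lo) / (hi - lo) * 660
    for i in range(int(rate * note_seconds)):
        samples.append(0.5 * math.sin(2 * math.pi * freq * i / rate))

# Write 16-bit mono PCM so any media player can perform your data set.
with wave.open("sonified.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(rate)
    f.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))
```

As for the bonus segment: moving the bar from p < 0.05 to p < 0.005 is no small tweak. For a two-sided z-test, it raises the critical value from about 1.96 to about 2.81, meaning the same effect needs considerably more data to clear the bar; hence the headline above.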
Feature image: Street Art, Chelsea, NYC