Computer Science Articles: 2014

Saturday, August 9, 2014

Network Sockets

It seems as though network sockets have generated considerable confusion in the computer world. I feel that the confusion largely exists because the need to directly utilize sockets has pretty much faded. Until recently, I've only heard sockets being used as a networking buzzword, and whether or not people actually knew what they were wasn't any of my business. With the arrival of HTML5's WebSockets, however, much more consideration is being given to the word. In this article, I want to first distinguish "sockets" from "WebSockets". Then I will do a little socket programming.

HTML5's WebSockets came about as a result of the interactive web. Several years back the internet was all about serving up static web pages usually filled with information about some topic. Many websites served up a bunch of heavily linked pages often stored in a directory structure that separated them into some understandable hierarchy as determined by the authors. This resulted in us end users seeing rather long paths in the URL sometimes with a lot of slashes. Soon enough those static pages learned how to be a little more interactive with the help of AJAX. Websites turned into web applications. Every status update, blog post, comment, etc. you created or saw was a request sent to the server followed by a reaction on the web page easily carried out by JavaScript.

However, much of the "real-time" data we wanted to pass around the interwebs was rather small: a comment like "lol", a few pieces of weather data, etc. The meaningful data in a request had a sizable amount of text packaged with it in the form of headers that were sometimes much larger than the data itself. For sending small real-time data back and forth, carrying tons of header info is unnecessarily cumbersome. Additionally, using this method, data that needs to arrive sequentially, such as a vehicle's GPS coordinates, isn't guaranteed to arrive in the right order when it is spread across separate requests. A much more sensible solution keeps a single TCP connection open and sends small messages over it. TCP, part of the Internet protocol suite (TCP/IP), is ordered and error checked. WebSockets is the protocol and API that resulted from this realization.

Finally arriving at the subject of this article, a socket is simply an endpoint in network communication. The Linux man pages describe the system implementation of sockets in C. The socket function signature is simply:

int socket(int domain, int type, int protocol);

The domain specifies what protocol family is used in communication. In type, we specify whether we want to use SOCK_STREAM (TCP), SOCK_DGRAM (UDP), or SOCK_RAW (raw), among others. The protocol is usually 0, which selects the default protocol for the given domain and type.

Because a socket is an endpoint for network communication, it can be essentially defined very simply as an address and a port. Their usefulness beyond this depends on how you want to send the packets of information. The typical choice is TCP because the packets are numbered and can be reassembled at the recipient's endpoint. For most use cases, people want "real-time" data to arrive in the same order it was sent.
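As a small illustration of the signature and the address-and-port idea above, here is a sketch using Python's socket module, which mirrors the C call. The loopback address and the use of port 0 (which asks the operating system for any free port) are assumptions made for the example, not part of anything described in this article.

```python
import socket

# Same three arguments as the C call: domain, type, protocol.
# A protocol of 0 asks for the default for this domain/type pair (TCP here).
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)

# Binding gives the endpoint its address and port.
# Port 0 is an assumption for this sketch: the OS picks any free port.
sock.bind(("127.0.0.1", 0))
address, port = sock.getsockname()
print(address, port)

sock.close()
```

After the bind() call the socket really is just "an address and a port" from the outside; everything else is about how data moves through it.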

For more information on Linux socket implementation you can visit the man page at http://man7.org/linux/man-pages/man2/socket.2.html

I want to quickly move on to demonstrating socket programming. The following very simple use case for socket programming in PHP should get users started with PHP's socket functions.

A port scanner application in PHP

The above image shows a quick program I wrote in PHP that accomplishes port scanning using PHP's socket functions.

First, we create a socket object by specifying that we want to use IPv4 (AF_INET), TCP (SOCK_STREAM), and IPPROTO_IP, which is integer value 0. We allow the user to specify the $hostname through the system's command line arguments. It should come in as $argv[1] because $argv[0] is the filename of the script itself. We use PHP's gethostbyname() function to grab the IP address of our host. Then we iterate through all of the potential ports (I know, some of them are not used with TCP) and check whether or not we can create a connection within the specified timeout limit (0.2 seconds). As we print each port we are currently scanning, we put a star next to any that connected successfully. Finally, we print out a list of all the ports we could successfully connect to along with the size of the array.

A similar program can be written in Python (the reader is encouraged to try writing this one). Python includes a socket module that pretty closely mimics the C implementation found in Unix. It likewise creates socket objects based on domain, type, and protocol. It has the same functions that you may have discovered in the Linux man pages such as bind(), accept(), connect(), listen() and close().
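For readers who want something to compare their own attempt against, here is one possible sketch of the same idea in Python. It is not a translation of my PHP program; the function name scan_ports and its parameters are hypothetical, and it uses connect_ex(), which returns 0 on success instead of raising an exception.

```python
import socket

def scan_ports(hostname, ports, timeout=0.2):
    """Return the ports on hostname that accept a TCP connection."""
    # Resolve the name to an IPv4 address, like PHP's gethostbyname().
    address = socket.gethostbyname(hostname)
    open_ports = []
    for port in ports:
        # AF_INET = IPv4, SOCK_STREAM = TCP
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 when the connection succeeds.
            if sock.connect_ex((address, port)) == 0:
                open_ports.append(port)
    return open_ports
```

A call such as scan_ports("localhost", range(1, 1025)) would check the first 1024 ports. Only scan hosts you have permission to scan.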

Because this article is now ending I want to finish with some points that will make future socket programming for the reader much more enjoyable:

Sockets are endpoints in network communication
TCP is the protocol that keeps your data ordered for the recipient
Sockets depend on underlying system calls written in C as well as protocols which govern the way data is sent so that interconnected devices have a way of understanding them
Sockets are where we listen for and respond to information sent across the web
The TCP type socket usually uses a stream to receive and write out data much like a filestream

Look forward to an article about how to set up a PHP socket server, as I will consider writing that in the future. For now, I felt it necessary to give a brief introduction to network sockets and set them apart from the newly popular buzzword "WebSockets".

Sunday, June 29, 2014

Communicating between Python and PHP

Today, we are going to look at a very useful tool for making computations with your data on the server.

Why we need this article:

PHP is great for communicating with your database, especially creating new data, updating existing data, retrieving data and removing it. However, it is not suitable for making computations based on your data before displaying it to the user. Consider, for example, a web application that receives regular (every minute) GPS coordinates from your mobile phone as you go about each day. Later, when you visit the web application, you want to see how far you traveled by road on July 1st, 2014. Now you might be able to just grab a list of that huge set of coordinates for July 1st via your method of retrieving data from your database in PHP, but the thought of matching it up to the coordinates that constitute all roads in your vicinity is a much more daunting task. We can use more of your server's computational ability by using Python.

I don't want to spend a long time explaining why you should perform complex computations in Python instead of PHP. Let's get straight into how.

In PHP, you will need a function called exec(). This function can execute an external program on your server. On your server, you will need Python and a worthy script.

Good communication practices:

exec() takes an optional second argument, which is the PHP variable into which the output of your program will be placed. Unfortunately, this means that it will be an array of strings. So for every line your Python script print()s, you will get one element of your string array in PHP.

For example:

exec("python get_coords.py", $data);

$data will come out of this line containing one element for every line you print() in Python.

So if you know that it all comes down to communicating via strings, we have to decide how we're going to appropriately return our data from our Python script. JSON sounds like a perfect solution. Import the json module into your Python program:

import json

Python has some easy to understand data structures to build your JSON output with: dicts and lists. Here is a refresher on the two:

# ca and ce will both be lists
ca = get_afternoon_coordinates()
ce = get_evening_coordinates()
coords = {'afternoon': ca, 'evening': ce}

As you can see, you can create a pretty nifty nested hierarchy of lists and dictionaries that will translate easily into JSON.

When your Python program successfully builds this structure, you will want to use the json module to build a JSON string out of the structure. Then you'll print it as output.

print(json.dumps(coords))

The above line wraps up the responsibility of your Python program to create good data for PHP to parse.

In PHP you will then have access to this as the first element of the array coming back from the exec() command:

$jdata = json_decode($data[0], true);

json_decode is a PHP function that takes a JSON string and returns a PHP variable. You will want to pass true as the optional 2nd argument, which makes the returned objects come back as associative arrays.

If your outer json structure constructed by your python program was a dictionary, you will now have access to the values via $jdata['mykey'].
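For reference, the Python side described above can be assembled into one script. This is only a sketch: the two coordinate functions return made-up sample data (a real script would read from your database or a file), and build_output is a hypothetical helper name.

```python
import json

def get_afternoon_coordinates():
    # Hypothetical stand-in data; replace with a real lookup.
    return [[40.4406, -79.9959], [40.4417, -79.9901]]

def get_evening_coordinates():
    # Hypothetical stand-in data; replace with a real lookup.
    return [[40.4443, -79.9532]]

def build_output():
    coords = {
        'afternoon': get_afternoon_coordinates(),
        'evening': get_evening_coordinates(),
    }
    # json.dumps emits a single line by default, so PHP receives the
    # whole document as one element of the exec() output array.
    return json.dumps(coords)

if __name__ == "__main__":
    print(build_output())
```

Because everything is printed with a single print() call, the whole structure lands in $data[0] on the PHP side.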

Conclusion:

PHP is awesome and can do a lot.
Python is awesome and can do a lot.
Together they can do amazing things that each one couldn't do alone.
Use JSON.

Please feel free to use my techniques to smartly write web applications that use PHP and Python.

Sunday, June 8, 2014

Proof by Anselm: The Breathtaking Bovine & How to Prove Numerous Other Things

I invite you to join me in celebrating the contributions of Anselm to philosophical argument. I begin with a thought experiment:

Imagine with me a cow so beautiful that there is most certainly nothing more beautiful than said cow. We must agree that this breathtaking bovine exists in your mind as well as in mine now. And we can agree that, in our own subjective right, the cow we are imagining is the most beautiful there is because I have said to think so. While we might conjure different elementary images of such a cow, we have already granted it the property of being of utmost beauty. A beautiful cow in our mind and present in our world is even the more beautiful for that we can appreciate it with our eyes. If our bovine exists only in our mind, then we can conceive of a more beautiful bovine in reality. But we cannot imagine a more beautiful cow than the one I instructed you to because we agreed it to be the most beautiful. Therefore, the breathtaking bovine for which there are no more beautiful bovines must exist.

Anselm has given us the means to prove an infinite set of things! Please feel free to use the following strategy to conduct a proof by Anselm:

1. Start with an assumption that is agreeable to you and your audience alike. This can be a definition such as the most beautiful cow. It must be an assumption that grants the thing you wish to prove the existence of an unsurpassable quality. Remember that your audience must agree with you on this.

2. Suggest your audience conceive of the thing you wish to prove in their mind and that they grant their conception this quality you agree that it has.

3. Enforce to your audience the fact that now the thing exists in their mind.

4. Suggest a thing exactly as your audience is now imagining that exists in reality.

5. Inform them that that is a thing whose aforementioned quality surpasses that which they imagined to possess - because it is real. Their conceived version, thusly, is inferior to this new thing.

6. Now the thing that existed in your audience's minds must not only exist in their minds but also in reality for it is there that it achieves its unsurpassable quality.

And so I urge you, go forth and change the world with logic!

Saturday, June 7, 2014

From Procedural to Object Oriented Programming

The leap to Object Oriented Programming from procedural must be made by all aspiring developers. In the words of Stack Overflow user, exclsr, at how-to-teach-object-oriented-programming-to-procedural-programmers, "OOP is one of those things that can take time to 'get', and each person takes their own path to get there."

When I first started learning about object oriented programming, two examples were given in a Java course for a college major I soon elected to abandon (I'll explain later). Both will sound familiar to anyone who has had such elementary instruction. One, the example of a Vehicle from which can be derived Car, Bus, Airplane, etc. Two, the example of an Animal from which the derivations Dog, Cat and so forth are fathomable. The instructor followed the examples with a discussion of the importance of memorizing the definitions of certain OOP terms, claiming that all programming interviews will ask about three words in particular. The words were encapsulation, polymorphism and inheritance. One can already imagine employers at large companies like Amazon witnessing this abuse and being monumentally disappointed.

Needless to say, I aced this course and soon thereafter transferred whatever credits I could to Computer Science. I make a huge distinction between learning and memorization. And I don't fancy the idea of going back to school in the future when being a walking dictionary for programming words ceases to fool potential employers.

OOP is fundamentally different from procedural programming in various ways. Neither one implies the presence or lack of data structures, another important topic. One might recall writing C programs for a data structures course and using structs to define nodes for a linked list, graph, binary tree, heap and so on. The difference lies in the use of said data structures. A strictly procedural programmer may have become fairly appreciative of structs to represent a real world object, such as a game board, because of their ability to aggregate all the properties of the real world object. From then on, the procedural programmer is responsible for defining a set of functions that act upon the data structure (aligning nodes into a linked list, defining their parents to form a tree, traversals of said structures, ...). OOP says, might the tree be capable of organizing itself? Shouldn't the linked list simply have a function like bool contains(Node n)? In C#, a List may be constructed of any type whatever, and the coder may deservedly expect to find all the functions he or she desires to act upon the list.
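To make the contrast concrete, here is a minimal sketch in Python. The names list_contains and LinkedList are hypothetical and chosen only for illustration; the point is where the behavior lives, not the details of the list.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

# Procedural style: a free function that is handed the structure.
def list_contains(head, value):
    node = head
    while node is not None:
        if node.value == value:
            return True
        node = node.next
    return False

# Object oriented style: the list owns its nodes and answers for itself.
class LinkedList:
    def __init__(self):
        self._head = None

    def add(self, value):
        # Insert at the front; ordering does not matter for this sketch.
        self._head = Node(value, self._head)

    def contains(self, value):
        node = self._head
        while node is not None:
            if node.value == value:
                return True
            node = node.next
        return False
```

The loop is identical in both versions. What changes is that the caller of LinkedList never touches a Node at all.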

Nowadays ideas like what I just mentioned are commonplace in languages such as C# and Java that call themselves object oriented. An aside about the sarcasm in the last sentence: I've heard programmers tout that they 'know' object oriented programming because such and such language is object oriented. Nonetheless, they use the language in the same mundane, procedural way they would any other language.

A second difference between procedural and object oriented programming is described by the potential for domain decomposition. By this, it is meant that the design of a system is thought of less in terms of what happens first, next and last, and more in terms of what the components of the system are and how they interact with one another. The problem domain for a pharmacy and general store's inventory/sales software might consist of objects such as Item (from which we get PurchasableItem, RentalItem, PharmacyItem), RentalPolicy, POSTerminal, Inventory, etc. The inventory has some kind of collection of Item objects, functions for adding and removing Item objects and so forth. These are the separate components of what constitutes a larger system. They interact with each other in exactly the way we imagine they should. They are capable of being extended in parallel with the evolution of users' needs.
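A rough sketch of a slice of that domain might look like the following. The class names come from the paragraph above, but every attribute and method name (price, days_allowed, add, remove, count) is an assumption made for illustration.

```python
class Item:
    def __init__(self, name, price):
        self.name = name
        self.price = price

class PurchasableItem(Item):
    pass

class RentalItem(Item):
    def __init__(self, name, price, days_allowed):
        super().__init__(name, price)
        self.days_allowed = days_allowed  # hypothetical attribute

class Inventory:
    """Holds a collection of Item objects and manages adding and removing them."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def remove(self, item):
        self._items.remove(item)

    def count(self):
        return len(self._items)
```

Adding a PharmacyItem later means adding one class; Inventory does not need to change because it only ever deals with Item.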

To make the leap from procedural to object oriented programming a bit more of a breeze, I suggest the following progression.

It is ok to start with the silly Vehicle or Animal class, even the popular Shape class (whose derivatives are Rectangle, Circle, and for the bravehearted, the ellipse). Use these to practice with subclasses and to enforce the ideas of inheritance and polymorphism.
But unless you are writing some odd vehicular diagramming tool or a taxonomy application, there needs to be some implementation of solutions to real world problems to bring the understanding home. Using OOP in assignments for college courses is a great way to practice. Even writing games using OOP presents a more realistic scenario.
Look into responsibility driven design.
A topic often connected to OOP is object oriented design patterns and principles. Research these. Patterns aren't created. They exist because they work very well and people have come to realize this over time. Principles should be taken as guidelines for good practice. A common object oriented topic is that of coupling and cohesion. Look into these and why we should seek to lower the former and increase the latter. Many of the design principles exist because they help achieve one or the other or both.
Check out one of the more important offspring of OOP, model-view-controller. Understand why we strive for separation of concerns.
Continue your research in this same vein, interspersing practice projects alongside your research.
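As a starting point for the first step in the progression above, the Shape exercise can be sketched in a few lines of Python. The constructor parameters and the total_area helper are hypothetical; the sketch only shows inheritance and polymorphism at work.

```python
import math

class Shape:
    def area(self):
        # Subclasses are expected to override this.
        raise NotImplementedError

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

class Ellipse(Shape):
    def __init__(self, semi_major, semi_minor):
        self.semi_major = semi_major
        self.semi_minor = semi_minor

    def area(self):
        return math.pi * self.semi_major * self.semi_minor

def total_area(shapes):
    # Polymorphism: each shape supplies its own area() implementation.
    return sum(shape.area() for shape in shapes)
```

Note that total_area never asks what kind of shape it was given.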

You will be more likely to answer any OOP interview question well than if you had merely memorized definitions.

Friday, May 30, 2014

Settings: Giving your end user some control

It goes without saying, I've been working on a software project recently. This particular project was not simply a personal endeavor for the purpose of learning but a freelance job for a local organization. As it turns out, I found it crucial in this project to allow the end users (particularly those who would act in admin roles for it) to have various settings within the application, which differs greatly of course from an application I would write for myself.

Software written for commercial purposes, in fact, tends always to have a slew of different settings that can be manipulated by the end user. The premise would seem to be that the developer knows not what the user wants specifically but has only a general idea. As I embarked on this project, I took what I had learned in software engineering courses that teach you to gather requirements, adapt to changes, and meet frequently with the customer as part of the software development lifecycle (SDLC) and applied it to my approach. In all of this, the goal was validation. Are we building the right software? Is this what the customer wants?

One would imagine after the process geared toward constant/frequent validation was over we'd have tailored the system enough to exactly meet the customer's wishes. Why then, should I deem it necessary to include lots of settings?

Settings are a means to adapt to change even after the project is over. One of the settings I included in the project, for instance, was the URL of the client's webpage. Should this ever change in the future when I'm no longer working with the client, they should be able to make the adjustment for themselves, rather than hire a new developer. (Part of the system I built was a lite web browser that kept users within the domain of the client's website).

Settings satisfy a clientele who changes their mind often. I needed to include a setting for the length of time of inactivity on the software to return it to a default view. The client initially thought that 2 minutes was the appropriate length. He eventually changed it to 5, and then to 10. I said in one of the meetings, "How about I let you decide later in the admin interface?" Not surprisingly, that was perfectly agreeable to both parties.

Settings make the product personal. The application my team and I developed for our client was a kiosk application for a tablet connected to a large Samsung display. One of the settings included was what the default screen would say when no one was walking up and using it. I was initially told exactly what the client wanted it to say on that screen. However, I went ahead and included this as an option to be set by the admin if in the future he felt differently. A few prototypes down the road, our client was taking a swing at using it by himself without our guidance and one of the first things he did was take that title and make it his own.
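The three settings mentioned so far could be modeled along these lines. This is a sketch in Python and not the code from the actual project: the class name, field names, default values and the JSON file format are all assumptions.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class KioskSettings:
    # All names and defaults here are hypothetical.
    home_url: str = "https://example.org"
    inactivity_timeout_minutes: int = 2
    idle_screen_title: str = "Welcome"

def load_settings(path):
    """Read saved settings, falling back to defaults for anything missing."""
    try:
        with open(path, encoding="utf-8") as f:
            saved = json.load(f)
    except FileNotFoundError:
        saved = {}
    # Ignore keys this version of the application does not know about.
    known = {k: v for k, v in saved.items()
             if k in KioskSettings.__dataclass_fields__}
    return KioskSettings(**known)

def save_settings(settings, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(settings), f, indent=2)
```

Defaults capture what the client asked for on day one; the admin interface only needs to call save_settings when someone changes their mind.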

My advice is, giving the end users options shouldn't be optional. Do it. Make the application as customizable as you can as long as it can't be changed so that it no longer serves its purpose. If a client is set on one specific "background color" for a GUI, for instance, it means you need to dust off that color chooser control and cram it in there. Chances are, your client will be ultimately more happy with the fact that he or she gets to choose the color any time he or she wishes versus realizing it doesn't quite look so hot anymore a few weeks down the road.

Thursday, May 29, 2014

Java Card Game Engine at a Glance

Not long ago, I decided to write a quick card game in Java. I wanted to write War because about a year prior, I had helped a friend of mine start learning Python for the first time and we wrote a War game together. Then recently, I started helping him again as he began to take on Java for a new job he scored. The incentive was here once again to write War, and it would make for a great example of some basic Java usage/syntax.

My project can be found on GitHub and is free to use: https://github.com/thedouglenz/JavaCardGameSystem/

I wanted to utilize as many object oriented programming tactics as I could, so I started designing and writing my models. The objects I wanted to capture and use from real-world card games were, of course, the idea of a card, a deck, a hand, a player, etc. I even ventured to create a class for a table, which encapsulated, among other things, the number of sides of the game table for a given game. I added functionality to the deck such as dealHand and shuffleDeck. I gave every Player a Hand, every Player the ability to play a card or a collection of cards on the table, the deck the ability to track and report its size, ...
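To give a feel for that layer of models without opening the repository, here is a condensed sketch in Python. It is not the project's Java code; the method names are Python-style stand-ins for dealHand and shuffleDeck, and the integer rank encoding is an assumption.

```python
import random

SUITS = ("clubs", "diamonds", "hearts", "spades")
RANKS = tuple(range(2, 15))  # assumption: 11-14 stand for jack, queen, king, ace

class Card:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

class Hand:
    def __init__(self, cards=None):
        self.cards = list(cards or [])

class Deck:
    def __init__(self):
        self._cards = [Card(rank, suit) for suit in SUITS for rank in RANKS]

    def shuffle_deck(self):
        random.shuffle(self._cards)

    def deal_hand(self, size):
        if size > len(self._cards):
            raise ValueError("not enough cards left in the deck")
        dealt, self._cards = self._cards[:size], self._cards[size:]
        return Hand(dealt)

    def size(self):
        return len(self._cards)

class Player:
    def __init__(self, name, hand):
        self.name = name
        self.hand = hand

    def play_card(self):
        # Plays the last card in the hand; a real game would let the player choose.
        return self.hand.cards.pop()
```

None of these classes knows anything about War, which is what lets the same models serve other games.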

In fact, it continued like that until I realized that I was creating more than just the card game War. I had the whole domain model built for potentially lots and lots of card games. I could already sense the basic functionality was present for Crazy Eights, Rummy, Hearts, etc.

The aspiring user would simply have to take my layer of models and write most any game he or she desires by simply building an interface that controls game logic and some kind of GUI. Consequently, I wanted to try this for myself, which brought me back to War. I simply built a WarGUI class that utilized Java's Swing libraries to make the most basic GUI known to man, built a class, WarController, to control the game logic and a class, WarGame, that initialized everything the way I needed it. War obviously has a pretty standard and logically easy game loop. Nonetheless, the models served the purpose very well.

Then my aforementioned friend asked if my system could be used to make poker. While I was tempted to say that it could, I knew that poker was a little bit different than the others. And a non-expert user would still have a lot of work to do if they wanted to use my system to invent a poker game. So I took to writing PokerHandEvaluator5Card. This is one of those classes justified by one of the General Responsibility Assignment Software Patterns (GRASP) known as "pure fabrication". Clearly, in real life there is no such thing as a "poker hand evaluator". Human beings are bright enough to do that work within minutes (or seconds) and dismiss it as trivial that they were able to do it. However, for the purposes of programming a dumb computer, a thing like evaluating a poker hand is a chore. Thusly, my implementation creates a new instance of my evaluator for each hand that needs it, and that instance exists thereafter only to report what poker hands are in it.
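The idea of one evaluator instance per hand can be sketched as follows. This is an illustrative Python version and not the Java class from the repository: the (rank, suit) tuple representation, the method names and the returned strings are all assumptions.

```python
from collections import Counter

class PokerHandEvaluator5Card:
    """Sketch: cards are (rank, suit) tuples with ranks 2..14 (14 = ace)."""

    def __init__(self, cards):
        if len(cards) != 5:
            raise ValueError("exactly five cards are required")
        self.ranks = sorted(rank for rank, _ in cards)
        self.suits = [suit for _, suit in cards]

    def is_flush(self):
        return len(set(self.suits)) == 1

    def is_straight(self):
        if self.ranks == [2, 3, 4, 5, 14]:  # ace-low straight
            return True
        distinct = len(set(self.ranks)) == 5
        return distinct and self.ranks[4] - self.ranks[0] == 4

    def best_hand(self):
        # Sizes of the groups of equal ranks, largest first, e.g. [3, 2].
        counts = sorted(Counter(self.ranks).values(), reverse=True)
        flush, straight = self.is_flush(), self.is_straight()
        if flush and straight:
            return "straight flush"
        if counts[0] == 4:
            return "four of a kind"
        if counts == [3, 2]:
            return "full house"
        if flush:
            return "flush"
        if straight:
            return "straight"
        if counts[0] == 3:
            return "three of a kind"
        if counts == [2, 2, 1]:
            return "two pair"
        if counts[0] == 2:
            return "pair"
        return "high card"
```

The evaluator holds no game state beyond the five cards it was given, which is what makes it a fabricated helper and not a domain object.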

As you browse through my code and decide if and how you want to use it for your Java card game, feel free to make any additions/modifications. The state of the project at the time of this writing is "in progress" (in quotes because while I say it's in progress, I haven't actually touched it in a while as I'm busy with other endeavors now, namely an Android application). Enjoy!

What do method signatures tell us?

A Java method signature

A significant aspect of object oriented programming is deciding what "objects" need to be designed, what data fields (or properties) they will have, and what behaviors (methods) are encapsulated within. For large projects, this usually starts on pencil and paper as opposed to in some IDE. One of the artifacts during what I like to call the "drawing board stage" is a class diagram. In the Unified Modeling Language, a class diagram is a structural representation of a system's classes and the relationships between them. Often, a class diagram will include listings of the properties (and their types) and methods (and their signatures) of a given class. UML is supposed to give us the means to design a system entirely before any code is truly written in a way such that anyone with the UML design plan can implement the system. Hence, the term "unified."

So the question may arise, how much can we ascertain about the implementation of a particular aspect of a software system from a given method signature? What degree of ambiguity does it introduce to the implementation phase?

If you look at a method signature as it is represented in a UML class diagram, you can pick up the following things: what class it belongs to, what its name is, what its return type is, and what its expected parameters are. For those who are reading this just to answer the title question, let's look at those individual pieces.

What class it belongs to
Keep in mind, you can garner this without having to look at a class diagram. It is obviously also apparent in the code itself. Say your colleague uses a tool that transforms a class diagram into a bunch of classes with their associated properties and method stubs to start you off with before you do the dirty work of actually coding. Clearly, the method's owner is still available to you. Is the class a hint to how the method should be implemented? Maybe so. A method called "speak" that belongs to a Dog class would probably indicate that the intention with "speak" was for the Dog object to bark, as opposed to what "speak" might do in a Cat class.

What its name is
Your biggest ambiguity buster is the method's name. For example, a method called "average" can usually be rightfully assumed to take an average of its argument(s). However, a method called "fx98234kl" may do any sort of computation known to man.

What its return type is / what parameters it expects
Beyond that, when the signature is the only clue you have, the implementation is largely determined by the input and output. Take a look at what parameters have been chosen as the input to the method and their names and types. Then examine the return type of the method. Then it becomes much like a physics problem for a student who hasn't studied. He or she may ask, "What are the starting units for this problem? What does the question tell me the ending units should be? What formula will get me there?"

Often we can infer what is intended by a method signature from these 4 things. Say, for instance, your colleague designs a class called Flight that has a property, departureTime : Date, and contains the method delayFlight(int minutes) : Date. What can we reasonably assume?

We might decide that the method body should simply compute the Flight's departure time plus the number of minutes given in the argument and return that result. In a different company, someone writes the following signature: delayFlight(int minutes) : void. Now we are to write the function such that it computes the same result, but we are likely meant to set the class's instance variable departureTime right inside the method body.
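The two readings can be put side by side. The sketch below is in Python, which cannot overload on return type, so the two variants get different hypothetical names (delayed_departure and delay_flight); the Date type is approximated with datetime.

```python
from datetime import datetime, timedelta

class Flight:
    def __init__(self, departure_time):
        self.departure_time = departure_time

    # Reading 1, delayFlight(int minutes) : Date
    # Compute and return the new time; the object itself is left untouched.
    def delayed_departure(self, minutes):
        return self.departure_time + timedelta(minutes=minutes)

    # Reading 2, delayFlight(int minutes) : void
    # Nothing is returned, so the only sensible effect is to update the state.
    def delay_flight(self, minutes):
        self.departure_time += timedelta(minutes=minutes)
```

Even here some ambiguity remains: the first signature does not tell us whether the stored departure time should also be updated, which is exactly the kind of hole a stub leaves open.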

To answer my title question, we can get a lot of concrete information from method signatures, and can even make some reasonable inferences. However, a method stub is a hole in the plan laid out by the use of UML. Filling the hole and making an ultimately complete design plan for a software system before code is written should include a similar plan at the modular level. A "unified pseudo-code," if you will, might solve this problem.