REThink 2018, Week 1

For this iteration of the REThink Program, I am going to try weekly blogs instead of daily blogs. I am fortunate enough to be working with Dr. Mark Zarella again. Here is my experience for week 1:

Monday: Believe it or not, I still had school. So I had to miss the first day. It sounds like it was very similar to the first day from 2015, where it was general on-boarding and procedures, as well as countless trips to the ID card office to try and get everyone ID cards.

Tuesday: I came in and met my partner for this year, Laurence. There was a debrief with the group, in which I was able to meet everyone who is participating this year. Laurence had spoken with Dr. Mark Zarella the day before, and Mark mentioned that one of the projects would be writing a tutorial for the algorithms that they currently have working, as it seems that not many in the academic world are using them. I then did a literature search to see what Mark et al. have been up to. I read the following articles:

“An alternative image representation for the reduced impact of H&E staining variability”
Zarella MD, Yeoh C, Breen DE, Garcia FU
PLoS One 12(3): p. e0174489 (2017)

“A template matching model for nuclear segmentation in digital images of H&E stained slides”
Zarella MD, Breen DE, Xin W, Garcia FU
Proceedings of the 9th International Conference on Bioinformatics and Biomedical Technology, Lisbon, Portugal (2017)

“Contextual modulation revealed by optical imaging exhibits figural asymmetry in macaque V1 and V2”
Zarella MD, Ts’o DY
Eye and Brain (9): 1-12 (2017)

“Cue combination encoding via contextual modulation of V1 and V2 neurons”
Zarella MD, Ts’o DY
Eye and Brain (8): 177-193 (2016)

“Lymph node metastasis status in breast carcinoma can be predicted via image analysis of tumor histology”
Zarella MD, Breen DE, Reza MdA, Milutinovic A, Garcia FU
Anal. Quant. Cytol. Histol. 37(5): 273-285 (2015)

We then set up a meeting with Mark for Wednesday afternoon.

Wednesday: First, I met with Laurence and finished the literature review. We then met with Mark over at the Drexel Medical School. He detailed his two wants:

1. A tutorial so that more people will use the algorithms in the academic world. He gave this website as an example: http://www.andrewjanowczyk.com

2. Integration with QuPath, a (relatively) new open source program for pathology, into which there may be a way to integrate his MATLAB scripts.

Laurence took need #1, and I will be working on need #2.

I also found out they are still using the MATLAB script that I got working during my last stint in the REThink Program in 2015.

Thursday: I went over to the Neuroergonomics conference at the Bossone building, and went to the poster sessions. It was very interesting seeing this growing field, and the new devices that they have to measure EEG and fNIRS. I then used the downtime to see about getting the 2015 script running in MATLAB. It wasn’t working as expected, and for the life of me I couldn’t figure out why. I ended up downloading the new BioFormats toolbox for MATLAB. That helped, but it still wasn’t passing the image through as expected to open the viewer.

Friday: I woke up with an idea of how to solve the issue, immediately made the changes in MATLAB, and it now runs flawlessly. We met up to discuss our Project Proposals, and then went to one of the dining commons for lunch. I tried to set up QuPath for MATLAB, and encountered an issue with the toolbox (which outputs an error and is semi-deprecated). I then joined the QuPath Google Group to inquire about any potential solutions to this error.

REThink 2015 Day 20- Terrible PATCO commute, Posters, MATLAB Functions

So this morning was perhaps the worst PATCO commute, due to 2 broken down trains, and the fact that there’s only one track to get across the Benjamin Franklin Bridge. We were delayed for 47 minutes, so I ended up being super late this morning. As per usual, we did our weekly check-in, and I reported about my current progress (see previous post).

We started talking about our Poster Session. If you’re interested in coming out and seeing the work that everyone has done, here’s the information:

Room 326 at the Science Center (3401 Market Street / the Excite Center) on Wed 8/12 at 2pm.

We then talked about the data for each of our projects. Here’s what I wrote:

Input/Output

The input data of my project is ImageScope files, from the Aperio ImageScope slides. The slides are stained using Hematoxylin and Eosin stain (H&E). The slides were made daily.

The output data of my project is .mat files, to be analyzed using MATLAB. I will be using the “scatter3” function in MATLAB to analyze the HSV colorspace of the files.

Data Type

The data are the descriptor variables ‘classidx’ and ‘centroid’ in each output file, which assign a centroid value to each class. The data contain centroid values for nuclei, cytoplasm, stroma, and white space. There are 30 files that contain this information, so I will have 120 data points (30 files times 4 classes per file).

What are some questions about the data?

Are the nuclei (cytoplasm, stroma, whitespace) always within a certain color space?

Are the variations in slides due to region, preparation, or analysis?

This afternoon, I did a bit of MATLAB and just basic matrix research, as that seems to be how I need to analyze this data.

[Images: output, output2, outputdata]

That’s basically what I’m working with now. Let’s hope PATCO doesn’t let me down on my commute home!

REThink 2015 Day 15, Day 16, and Day 17- Super Busy

I didn’t get a chance to blog the last couple of days because a lot has been going on here with the project.

Monday Morning- We had our typical “Progress Report” session, in which we go around and share what we have accomplished, and what we need to accomplish. We then spent some time talking about cryptography. It’s super interesting how it is a cat-and-mouse game, where the math is essentially made more difficult once computers catch up in their processing capabilities. This worksheet was a blast to complete, and you can use it to encode/decode a message: https://www.cs.drexel.edu/~introcs/Fa14/labs/LB-EndOfTerm/RSAWorksheetv4e.html

[Image: cryptography]

Monday Afternoon- Between Kathy, me, and Google, we got my program working! It can now classify cell parts based on HSV. This will let me go through the 40 images that I have and look at the color space of the prepared H&E slides. I also made the GUI more Drexel (Blue & Gold).

[Image: newdrexel]

Tuesday-  I worked on classifying pictures all day. There’s not much to report. I did send the images to Mark to analyze to make sure everything was being output correctly, and received the green light.

Wednesday- We spent the morning with Bill, talking about all kinds of computer science processes. We first signed up for Bitbucket accounts, and Bill showed us how to use the service. It’s like GitHub or even Dropbox. My repository wouldn’t upload because my image files are huge; even with the academic version, it was still too big.

We then did some socket programming. Bill’s Instructions:

Part 0 – Downloading the Required Files

  1. Download http://www.pages.drexel.edu/~wmm24/rsa/RSALibrary.zip and http://www.pages.drexel.edu/~wmm24/rsa/SecureChat.zip.
  2. The objective of the following lesson is to give students experience in:
    1. Protocol design, implementation
    2. Modification or augmentation of protocol layers
    3. Algorithm implementation
    4. The RSA Cryptosystem
    5. Importing and using Java libraries
    6. Optionally: Key exchange algorithms, Threading, and Message signatures
    7. Optionally, if running on a wireless network or a wired hub, use Wireshark to “intercept” messages across the wire.

Part 1 – Create Eclipse Project for MiniRSA

  1. Under Eclipse – go to File -> Import, and select “Existing Projects into Workspace” as below.
  2. Unzip the project code, and in Eclipse, select “Select Root Directory” and browse to the “RSA Java with Library” directory as the root.
  3. Select the RSA Project and click Finish

Part 2 – Including the RSAMath Library jar

  1. Right click on the project and click Properties
  2. Click on the “Java Build Path” setting to import the jar, and bring up the Libraries tab across the top.
  3. Click “Add External JARs” and select the rsamath.jar file inside the “RSA Java Library” directory, unzipped as part of your archive.

 

Part 3 – Checking out the MiniRSA Algorithm

As discussed, students will implement (and you are given!) the RSA algorithm.  There are 5 files included in the cs4hs11.rsa package, and they are described here in the order in which you should write and run them:

  • java – this file contains the entire RSA algorithm, from start to finish, and is suitable for describing the algorithm in general terms.
  • java – this class prompts the user for primes (users can enter 2 and 3 to compute the 2nd and 3rd prime number, so they don’t have to come up with their own primes! Note that large numbers are clearly stronger, but are also slower to compute here and crack, so numbers > 10 and < 100 are generally appropriate).
  • java – this class prompts the user for a friend’s encryption key (E, C), and a string, and returns the encrypted values to share with a friend.
  • java – this class is run by a student after receiving an encrypted message from a friend, and prompts the user for the decryption key (D, C), and each encrypted value before decrypting them to the original message.
  • java – this class prompts the user for a friend’s encryption key (E, C), and brute-force determines the decryption key (D, C) associated with it. This is good for discussing how encryption is hiding in plain sight, protecting information with keys, not algorithms.
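The key generation, encryption, and decryption steps described above can be sketched roughly as follows. This is only a minimal illustration of textbook RSA with tiny primes, not the code from the cs4hs11.rsa package: the class name MiniRsaSketch and all method names are hypothetical, the primes are passed in directly instead of being prompted for, and each character is encrypted on its own (so every character code has to be smaller than the modulus C).

```java
import java.math.BigInteger;

/** Hypothetical sketch of textbook RSA with tiny primes; not the workshop's code. */
public class MiniRsaSketch {

    /** Returns {E, D, C} built from two small primes p and q. */
    static long[] generateKeys(long p, long q) {
        long c = p * q;                    // modulus C, shared by both keys
        long phi = (p - 1) * (q - 1);      // totient of C
        long e = 3;
        while (gcd(e, phi) != 1) {         // smallest odd E >= 3 that is coprime to phi
            e += 2;
        }
        // D is the inverse of E modulo phi, so (E * D) mod phi == 1
        long d = BigInteger.valueOf(e).modInverse(BigInteger.valueOf(phi)).longValue();
        return new long[] {e, d, c};
    }

    static long gcd(long a, long b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    /** Computes base^exponent mod modulus; used for both encryption and decryption. */
    static long modPow(long base, long exponent, long modulus) {
        return BigInteger.valueOf(base)
                .modPow(BigInteger.valueOf(exponent), BigInteger.valueOf(modulus))
                .longValue();
    }

    /** Encrypts each character separately with the public key (E, C). */
    static long[] encrypt(String message, long e, long c) {
        long[] out = new long[message.length()];
        for (int i = 0; i < message.length(); i++) {
            out[i] = modPow(message.charAt(i), e, c);
        }
        return out;
    }

    /** Decrypts each value with the private key (D, C) and rebuilds the string. */
    static String decrypt(long[] cipher, long d, long c) {
        StringBuilder sb = new StringBuilder();
        for (long value : cipher) {
            sb.append((char) modPow(value, d, c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long[] keys = generateKeys(43, 61);            // assumed example primes
        long[] cipher = encrypt("hello", keys[0], keys[2]);
        System.out.println(decrypt(cipher, keys[1], keys[2]));
    }
}
```

Encrypting one character at a time keeps the numbers small enough to write on a board, but it also means identical letters always encrypt to identical numbers, which is one reason this is a teaching toy and not something to protect real data with.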

Within the RSA Java Library directory, you are given a jar file and a javadoc directory, both of which are suitable for giving to students.  The nice thing here is that they can get some practice importing external jar files and using simple interfaces as published in the JavaDoc.  If you want to trade this for more programming practice and algorithm implementation, by all means feel free to have students implement the functions inside the library.  They are:

Part 4  – Running MiniRSA

  1. To run MiniRSA, first generate a public/private key pair by running GenKey.java (right click on the file, and select Run As -> Java Application).
  2. Write your Public Key on the board or Google Doc, and privately save your Private Key. Here, my public key is (451, 2623) and the private key is (1531, 2623).
  3. Have a friend encrypt a message to you by running Encrypt.java with your public key, and store the encrypted numbers into the Google Doc (or on the board).
  4. Once you have received a message, decrypt it in similar fashion by running Decrypt.java with your private key.
  5. Now, let’s crack our friend’s key by running RSACracker.java with the public key found in the Google Doc:
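To see why small keys are crackable, here is a rough sketch that uses the key pair quoted in step 2, public (451, 2623) and private (1531, 2623). It assumes the cracker works by trial-dividing the modulus C and then inverting E modulo (p - 1)(q - 1); whether RSACracker.java takes exactly this approach is not stated in the instructions, and the class and method names below are hypothetical.

```java
import java.math.BigInteger;

/** Hypothetical sketch: checks the key pair quoted above and brute-forces it. */
public class RsaCrackSketch {

    /** Computes base^exponent mod modulus. */
    static long modPow(long base, long exponent, long modulus) {
        return BigInteger.valueOf(base)
                .modPow(BigInteger.valueOf(exponent), BigInteger.valueOf(modulus))
                .longValue();
    }

    /**
     * Recovers D from a public key (E, C) by trial division of C.
     * Assumes C is a product of two distinct primes; only feasible for tiny C.
     */
    static long crack(long e, long c) {
        for (long p = 2; p * p <= c; p++) {
            if (c % p == 0) {
                long q = c / p;
                long phi = (p - 1) * (q - 1);
                return BigInteger.valueOf(e)
                        .modInverse(BigInteger.valueOf(phi))
                        .longValue();
            }
        }
        throw new IllegalArgumentException("no factor found: C is not a product of two primes");
    }

    public static void main(String[] args) {
        long e = 451, d = 1531, c = 2623;          // key pair from step 2
        long secret = modPow(72, e, c);            // encrypt the character code for 'H'
        System.out.println(modPow(secret, d, c));  // prints 72
        System.out.println(crack(e, c));           // prints 1531
    }
}
```

The modulus 2623 factors as 43 × 61, so the loop finds a factor after only a few dozen divisions. With primes hundreds of digits long, the same loop would not finish in any reasonable time, which is the point of the exercise.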

 

Part 5 – Encrypted Chat Application

  1. You are given, as before, the full solution to work with. However, students’ versions may vary widely here depending on the “protocol” that they choose with you in the classroom.  The important thing is that they:
    1. Decide on a protocol
    2. Implement a simple chat client and server based on that protocol
    3. Optionally add threads to the sending and receiving behaviors so that they can chat “in parallel” and not have to wait for a message to arrive before sending. (this can be provided for students as well so they can appreciate a more fully functional chat app!).
    4. Implement the RSA algorithm as above
    5. Generate keys and incorporate encryption into the sending and receiving of their messages
    6. Optionally pass the keys to one another at startup – but then, this would be insecure! This could create discussion about key exchange algorithms for more advanced students (i.e., perhaps they could research this and do a short presentation).
  2. As before, unzip the contents of SecureChat.zip and import this project directory into your workspace. You will find several files:
  • java – The chat server creates a server socket, listens for a friend to connect via a chat client, spawns threads to send and receive encrypted messages. It takes in 6 parameters: the port number to listen on, the friend’s public key E, the friend’s public key modulus C, your private key D, your private key modulus C, and a 1 or 0 to enable or disable verbose debugging messages.
  • java – Similarly, the chat client takes in 7 parameters, and differs only in the first, which is the IP address of the server peer to connect to. Otherwise, it also takes the port to connect to, public and private key information, and a 1 or 0 for debug messages.
  • java – This class serves double-duty: it takes in a DataStream (either in or out) and, depending on which one it was given, reads from the socket or writes to the socket. So it is a threaded chat sender OR a threaded chat receiver (the main program will create two of these).  This one already includes encryption and decryption support.
  • java – This is the key generator class from the MiniRSA package
  • java – This contains the RSAMath library functionality formerly provided as a jar package.
  1. Create a run configuration for the server by right clicking on ChatServer.java, and selecting Run As->Run Configurations.
  2. Click the Arguments tab and provide the arguments as described above, using the public and private key information determined from the MiniRSA activity:
  3. A friend can do the same for RSAClient; give your friend your IP address and port number by running ipconfig from a command prompt (but don’t exchange private keys!):
  4. Set the debug flag to 1 to show that it really is sending encrypted traffic across the network:
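As a rough picture of what the server and client do with the keys, here is a minimal loopback sketch, not the SecureChat code itself. The protocol is an assumption made up for illustration (a length prefix followed by one encrypted long per character), the class and method names are hypothetical, and both ends run in one process with a single sender thread so the example is self-contained.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.math.BigInteger;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical loopback sketch of an RSA-encrypted chat exchange. */
public class SecureChatSketch {

    /** Computes base^exponent mod modulus. */
    static long modPow(long base, long exponent, long modulus) {
        return BigInteger.valueOf(base)
                .modPow(BigInteger.valueOf(exponent), BigInteger.valueOf(modulus))
                .longValue();
    }

    /**
     * Encrypts the message with (E, C) on a client thread, sends it to a local
     * server socket, and returns what the server decrypts with (D, C).
     */
    static String roundTrip(String message, long e, long d, long c) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            int port = server.getLocalPort();      // port 0 above means "pick a free port"
            Thread sender = new Thread(() -> {
                try (Socket socket = new Socket(loopback, port);
                     DataOutputStream out = new DataOutputStream(socket.getOutputStream())) {
                    out.writeInt(message.length());           // assumed framing: length first
                    for (char ch : message.toCharArray()) {
                        out.writeLong(modPow(ch, e, c));      // one encrypted value per character
                    }
                } catch (IOException ex) {
                    throw new RuntimeException(ex);
                }
            });
            sender.start();

            StringBuilder received = new StringBuilder();
            try (Socket peer = server.accept();
                 DataInputStream in = new DataInputStream(peer.getInputStream())) {
                int length = in.readInt();
                for (int i = 0; i < length; i++) {
                    received.append((char) modPow(in.readLong(), d, c));
                }
            }
            sender.join();
            return received.toString();
        } catch (IOException | InterruptedException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Key pair from the MiniRSA activity: public (451, 2623), private (1531, 2623)
        System.out.println(roundTrip("hi there", 451, 1531, 2623));
    }
}
```

A real chat needs a second thread on each side so that sending and receiving can happen at the same time, which is what the threaded sender/receiver class described above is for.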

 

Part 6 – Interception with Wireshark

This part can be tricky because most wired networks are on switches or routers, as opposed to hubs, so traffic is generally not repeated to all hosts.  Therefore, it may be hard to capture traffic from all students.  Wireless cards that support it can be configured in Promiscuous mode, and Wireshark can be set up to capture the traffic by going to Capture -> Interfaces, and clicking Start next to your interface (here, mine was Microsoft).

And filtering on tcp.port == your_port_number (for the chat application).  Mine was 12345.

Note also that, if using the localhost interface, you should actually type in your local IP address and not localhost in the Chat program.  This is because Wireshark does not capture localhost traffic generally.  One can bypass this by forcing a route for local IP  traffic, by opening a command line as Administrator and entering the following Route command:

(note the corresponding route delete afterwards which should be run after the lesson).  The command only needs to be run once.  You will notice TCP traffic generated when you send or receive a message.  Here is mine (click on the packet, then expand the Data part in the middle of the screen).

Notice the last bytes are 98, 97, 10684, which are the encrypted bytes that were sent.  This can be fed to the RSACracker from earlier, along with the friend’s public key, to decrypt other messages that the friend receives.

Socket programming was pretty interesting. I spent the afternoon going through and classifying my MATLAB (or I guess, Imagescope) H&E slides.

REThink 2015 Day 14, Work from Home, MATLAB Images

[Image: issues]

This one is going to be (even) shorter than yesterday, as I’m about to get out the door to drive to Massachusetts.

I’m trying to fix the same issue that I was having yesterday. We tried the MATLAB functions:
cd('Images');
imagearray = evalc('ls *.svs')
[FILENAMES]= strsplit(imagearray);
FILENAMES = sort(FILENAMES)
FILENAMES = [FILENAMES]

So we’re taking the files in the “Images” folder, listing those with the .svs extension in an array, and then splitting the array. For my computer coding friends, you can most likely tell a Biologist wrote the rest, because it takes the FILENAMES array, sorts alphanumerically, and then outputs that array. (And then…the array is transferred to filenames…all lowercase.) Still, the Bioformats library is having trouble reading those files, so that’s what I need to work on come Monday.

REThink 2015 Day 13, Meeting and Project Update, MATLAB

I’m going to keep this one brief today.

Kathy and I went to the Med School. We presented our current project details. Here’s where we are:

Me:

I have the program (mostly) working, but it’s still not as “generic” as I’d like. I took out many of the hardcoded locations, but some are still a bit of a problem. The standalone does run, and does save the data. Mark would like to check the output data to see whether my cleaning up the code has changed any of the output variables.

Kathy:

Kathy just received her eye tracker, and is just beginning to play around with the API. She also received a Windows Machine, since the API is Windows Only. She bought the $495 Gazepoint tracker, and is hoping to have more to report next week.

After we returned, I spent the afternoon with Kathy trying to get MATLAB to read the images from the “Images” folder. It isn’t currently working…

REThink 2015 Day 12, Machine Learning, Arduino, More MATLAB

Today started off by talking about Machine Learning. In fact, we started using the WEKA program to begin our machine learning about Iris plants, and classifying them into groups. The University of California, Irvine has quite a repository of machine learning datasets, available here. I’ve attached the PowerPoint on what we did (more like a tutorial): Weka_a_tool_for_exploratory_data_mining (1)

We then worked on an Arduino Project, which was really interesting (and easy for me, as I used Arduinos in a previous RET).  Here’s the sketch, note that you’ll need a few libraries for it to run correctly:

#include <Wire.h>
#include "Adafruit_LEDBackpack.h"
#include "Adafruit_GFX.h"

// Driver object for the 8x8 bicolor LED matrix
Adafruit_BicolorMatrix matrix = Adafruit_BicolorMatrix();

void setup() {
  Serial.begin(9600);
  Serial.println("8x8 LED Matrix Test");

  matrix.begin(0x70); // pass in the address
}

// One byte per row, one bit per pixel (1 = LED on)
static const uint8_t PROGMEM
tongue_bmp[] =
{ B01100110,
  B10011001,
  B10000001,
  B10000001,
  B10000001,
  B01000010,
  B00100100,
  B00011000 };

void loop() {
  matrix.clear();                                      // blank the display buffer
  matrix.drawBitmap(0, 0, tongue_bmp, 8, 8, LED_RED);  // draw the bitmap in red
  matrix.writeDisplay();                               // push the buffer to the LEDs
  delay(50);
}

And here is the Teacher Guide to the laboratory (thanks, Suzanne!): Light Lab

Here are the setup instructions:

[Image: IMG_4880]

And the code I posted above is for a heart. Happy Wednesday!

[Images: IMG_4882, IMG_4877]

Now I’m spending the rest of the day working on MATLAB, trying to get the GUI to close when pushbutton1 is executed.

REThink 2015 Day 11, Major Breakthrough in MATLAB, Program about 70% complete

Today I had a big breakthrough in MATLAB. Late yesterday, I was able to get my for loop working, and started working on fixing certain aspects of the program. The version I was given had many of the file paths hardcoded. This is a bit sloppy, as the directories will not be the same on everyone’s computers. I ended up making the program more versatile, with folders in the pwd (present working directory) specified at user launch, or via a MATLAB variable.

[Image: cancer]

I also added more in terms of GUI features. Before, you would click the “Done” Button, and it would not acknowledge anything, and step through to the next photograph to be analyzed. I ended up adding a dialog box to my MATLAB code with:

h = msgbox('Data Saved');

So the end user would receive a message, letting them know that data had indeed been saved.

[Image: cancer2]

I also spent some time helping Barbara and Denise out with their Raspberry Pi issues. I’m very interested in the Raspberry Pi, so I don’t consider it time wasted at all. They were having some issues with getting it to run headless. One of the first things I realized is that when you are connecting it directly to the ethernet port of your computer, you need to assign it an IP address. So I used this DHCP Server program to do so: http://www.dhcpserver.de/cms/

After that step, I was able to connect via PuTTY, and follow the instructions for getting a VNC server set up. Finally, I made the VNC server autostart when the Raspberry Pi booted, using these directions here: http://www.instructables.com/id/Setting-up-a-VNC-Server-on-your-Raspberry-Pi/step4/Setting-up-the-Pi-to-Automatically-Start-a-VNC-Ser/ (The vncboot file instructions on the other links DID NOT work for me)

And here it is, running in all of its glory!

[Image: IMG_4875]

REThink 2015 Day 10, “How did my phone get so smart?,” DUCA, and some MATLAB

We spent this morning with Jeff providing our weekly update in regards to our projects. Everyone is at that part of their research where they may have hit a wall- there have been a lot of problems and it’s difficult to find exactly how to fix the problems. It’s very helpful to have such a great group of people to bounce ideas off. You can see that a lot of the people in our group had some ideas to help people. The only problem with my project is that so few people know MATLAB, so there weren’t many ideas of how to fix my problem.

Another interesting fact is that 3 other groups are going to be using WEKA in their projects. I had never heard of WEKA- it’s data mining software. Take a look: http://www.cs.waikato.ac.nz/ml/weka/

We then talked about big data and machine learning, in regards to smartphones and texting. Phones can learn so much from input, and predict words (and grow their dictionary). We worked with a dictionary and an algorithm that, given a (mistyped) set of letters, could predict what the word actually should have been. It’s a pretty neat exercise, and highly useful for learning language. Apple already does this on a large scale with Siri. It’s based upon the “hidden Markov model.”

Here’s a screenshot of the activity:

[Image: smartphone]

And here’s the actual link to the activity: https://www.cs.drexel.edu/~introcs/Fa14/labs/L9-loopsAI/Markov/TheWord2.html (make sure that you import a dictionary!)
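To give a feel for the idea of guessing the intended word from a mistyped one, here is a small sketch. Note that it swaps in a simpler technique than the activity uses: instead of a hidden Markov model, it just picks the dictionary word with the smallest Levenshtein edit distance to what was typed. The class name, method names, and the four-word dictionary are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: picks the dictionary word closest to a mistyped input. */
public class WordGuessSketch {

    /** Dynamic-programming Levenshtein distance between two strings. */
    static int editDistance(String a, String b) {
        int[][] dist = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            dist[i][0] = i;                       // delete every character of a
        }
        for (int j = 0; j <= b.length(); j++) {
            dist[0][j] = j;                       // insert every character of b
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                dist[i][j] = Math.min(
                        Math.min(dist[i - 1][j] + 1, dist[i][j - 1] + 1),
                        dist[i - 1][j - 1] + cost);  // substitution or match
            }
        }
        return dist[a.length()][b.length()];
    }

    /** Returns the dictionary word with the smallest edit distance (first one wins ties). */
    static String closestWord(String typed, List<String> dictionary) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String word : dictionary) {
            int distance = editDistance(typed, word);
            if (distance < bestDistance) {
                best = word;
                bestDistance = distance;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("hello", "help", "world", "word");
        System.out.println(closestWord("wprld", dictionary)); // prints "world"
    }
}
```

Edit distance treats every typo as equally likely. A model like the one in the activity can do better by learning which letters tend to be mistaken for which, and which words are common.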

We then went over to the DUCA students, and watched them. They were working on brute-force cracking a key (they were using small numbers in this scenario, so it is doable). It was very interesting to see such a range of abilities- some students were comfortable, and some were struggling to understand how exactly the whole thing worked. All of the students, however, were thoroughly engaged in the activity.

I’m now working on trying some more solutions on MATLAB. I’ll post if I have a major breakthrough, but I think I need to rewrite the entire function to make it work.

REThink 2015 Day 9, I still can’t write a “For Loop” that works

I don’t have much in the way of progress today. I spent maybe a few hours trying solutions, and the rest of the day reading search results on Google of how to write a “For Loop” correctly in this MATLAB program. I’ll try again next week!

REThink 2015 Day 8- Project Debrief, Obtain MATLAB files and begin

We began today by taking the Dragon Shuttle to the Med School, which is a bit more difficult than you would imagine. Last week, we went to the wrong stop on 33rd (we were at 33 and Market)- and the stop was moved to 33rd and Ludlow. By the time we realized our error and walked over, we missed the bus, so we had to wait for the 10:15am bus. We did manage to make it to our meeting on time with that bus, but it cut it close (we walked in 10:29 for a 10:30 meeting).

So this week, we skimmed the listings and found out it was about every 15 minutes or so. It turns out, though, that there is no 10am bus. In fact, the (rude) woman made sure to point this out to us very loudly.  So again, we had to take the 10:15am Drexel Dragon Bus. And we made it there right in the nick of time again.

We went through everyone’s projects in the laboratory. It was a smaller meeting than usual, as Dr. Mark Zarella was out of town, and Dr. Fernando Garcia came a bit late due to all of his responsibilities. Everyone’s projects are moving along. Wenyu is working on watershedding the cancer cell images to get better segmentation. I think the flaw is he is using binary images instead of greyscale images, which is making his job difficult.

Ben is working on something- he didn’t have much to report. Zahara is working on trying to find the function that explains the cell clumping with her synthetic data.

Ben was able to grab me some of the data on a flash drive. Andre provided me with the standalone MATLAB file, which needs to be edited, but which I managed to get relatively “working.”

I spent the afternoon trying to figure out how to put the correct “For Loop” in MATLAB, with little success. Ultimately, I did manage to get it to randomly select files, but I can’t sequentially go through them (yet).

[Image: firstmatlab]