PHSchedule

The summer before my senior year at PHS, the school announced that it would be switching from a very simple schedule to a much more complicated one, shown below:

It honestly wasn’t really that complicated, but there was major outrage, so the school’s vice-principal reached out to the robotics club and asked if anyone could make an app that would tell students their schedule on a daily/live basis. So when the advisor of the robotics club asked me and one other student if this was possible, we both brushed it off as a trivial task that wouldn’t take longer than 10 hours (spoiler alert: it took much longer). If you don’t care about my explanation of the code/process, all the code is open source, with the front-end here and the back-end here. Check out a write-up by my friend who made the front end here.

## General Planning/Major Decision Making

I did this with a friend of mine. He mainly worked on the front end and I mainly worked on the back end, but we both did enough work on both that I can talk about the entire project comfortably.

Both of us had a decent background in iOS development (using Swift), but since this app would be for the entire school, we needed it to be cross-platform. Since we’re super hipster (and lazy), we decided to use React Native, a JS-based framework for building cross-platform apps (used by companies such as FB and Airbnb). React Native can target both Android and iOS, which were our two main platforms, so we decided to give it a try.

So that the schedules would be individualized and accurate, we needed to access everyone’s online gradebook. We wanted to do this without ever touching the data ourselves, to maximize security, but that turned out to be impossible. The eventual solution was to build a Flask API.

## Front end

As mentioned before, we used React Native for the front end. We used a Navigator API as well as built-in components to structure the front end, seen here:

Here’s the general flow of the app:

  1. The user is prompted to enter their PowerSchool (the PHS gradebook) username and password. This is then stored ON THE DEVICE - security was our number one priority when building this app.
  2. That username and password are securely sent (encrypted AND over SSL) to our Flask API running on an AWS EC2 instance. The user’s schedule, as well as the letter day, is sent back to the device as JSON.
  3. That JSON is parsed and displayed alongside the system time, to show the user which class they are supposed to be in at that moment.
  4. It reloads when the screen is pulled down.

## Back end

The backend is what takes a user’s login credentials and gets their schedule as well as the letter day. This proved to be quite the challenge, as PowerSchool was not built with scraping in mind (what a surprise). However, some Beautiful Soup and way too much digging later, here’s the code that the server runs to handle all the requests. The code is a bit all over the place but honestly pretty simple at the same time. It takes POST requests, pulls the different pages on the user’s dashboard, and returns their tables as JSON. I did use a custom Python library built by another friend of mine, and this code was built off a project he worked on a few years prior.
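Before the listing, it may help to see what a request to this API looks like from the client’s side. The routes read the credentials from request headers (`username`, `pw`, `ldappassword`, `dbpw`), plus a `format` header for `/getSchedule`, so a call can be sketched roughly as follows. This is a minimal sketch and not the app’s actual client code (that part is React Native): the host `api.example.com`, the helper name `build_schedule_request`, and the placeholder credential values are assumptions for illustration. The request is only built here, not sent.

```python
from urllib.request import Request

# Hypothetical host; the real server address is not given in this post.
API_BASE = "https://api.example.com"


def build_schedule_request(username, pw, ldappassword, dbpw, sched_format="weekly"):
    """Build (but do not send) a POST request for the /getSchedule route."""
    headers = {
        "username": username,
        "pw": pw,
        "ldappassword": ldappassword,
        "dbpw": dbpw,
        # The server accepts currentYear, weekly, or matrix.
        "format": sched_format,
    }
    # An empty body is enough: the server reads everything from the headers.
    return Request(f"{API_BASE}/getSchedule", data=b"", headers=headers, method="POST")


# Placeholder values, purely for demonstration.
req = build_schedule_request("student1", "pw-value", "ldap-value", "db-value", "matrix")
print(req.get_method(), req.full_url)
print(req.get_header("Format"))
```

Passing `req` to `urllib.request.urlopen` would actually send it. Note that `urllib` capitalizes header names when storing them, which is harmless because HTTP header names are case-insensitive.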
import re
from collections import namedtuple
from urllib.parse import urljoin, urlsplit

import lxml  # parser backend used by BeautifulSoup(..., 'lxml') and pd.read_html
import pandas as pd
import requests
from bs4 import BeautifulSoup
from flask import Flask, request, jsonify, abort
from tabulate import tabulate  # only used by the commented-out debug output

from fbbot.infra import Session  # form-handling helper from a friend's library

# from flask_pushjack import FlaskAPNS
# from flask_pushjack import FlaskGCM
#
# config = {
#     'APNS_CERTIFICATE': '<path/to/certificate.pem>',
#     'GCM_API_KEY': '<api-key>'
# }

app = Flask(__name__)
# app.config.update(config)

@app.route('/')
def webResponse():
    return 'PHS Scheduler'

@app.route('/getInfo', methods=['POST'])
def getInfo():
    username = request.headers.get('username')
    ldappassword = request.headers.get('ldappassword')
    pw = request.headers.get('pw')
    dbpw = request.headers.get('dbpw')

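    # Load the login page so the session picks up its cookies and hidden form fields.
    url = 'https://pschool.princetonk12.org/public/home.html'
    rses = requests.Session()
    lp = rses.get(url)
    # __form_data is a private static method, so it is reached through its
    # name-mangled form, Session._Session__form_data.
    fd = Session._Session__form_data(lp.text, 'LoginForm', {
        'account': username,
        'pw': pw,
        'ldappassword': ldappassword,
        'dbpw': dbpw,
    }, form_url=url)
    # Submit the completed login form; later requests reuse the session cookies.
    rses.post(fd.post_url, data=fd.params)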

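    # The letter day appears in parentheses, e.g. "(C)", in the second item of the tools list.
    text = rses.get('https://pschool.princetonk12.org/guardian/home.html').text
    soup = BeautifulSoup(text, 'html.parser')
    tools = soup.find('ul', attrs={'id': 'tools'})
    if tools is None:
        # No tools list on the page, most likely because the login failed.
        abort(401)
    date = tools.find_all('li')[1]
    match = re.search(r'\(([A-G])\)', date.text)
    if match:
        letterDay = match.group(1)
    else:
        letterDay = 'No school today'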

    # Each schedule page is reduced to its first table, which pandas turns
    # into a JSON string of row records.
    text = rses.get('https://pschool.princetonk12.org/guardian/appstudentsched.html').content
    soup = BeautifulSoup(text, 'lxml')
    table = soup.find_all('table')[0]
    df = pd.read_html(str(table))
    currentYear = df[0].to_json(orient='records')

    text = rses.get('https://pschool.princetonk12.org/guardian/appstudentbellsched.html').content
    soup = BeautifulSoup(text,'lxml')
    table = soup.find_all('table')[0]
    df = pd.read_html(str(table))
    weekly = (df[0].to_json(orient='records'))


    text = rses.get('https://pschool.princetonk12.org/guardian/appstudentmatrixsched.html').content
    soup = BeautifulSoup(text,'lxml')
    table = soup.find_all('table')[0]
    df = pd.read_html(str(table))
    matrix = (df[0].to_json(orient='records'))

    print('LetterDay', letterDay)
    print('currentYear', currentYear)
    print('weekly', weekly)
    print('matrix', matrix)
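    # Each table is already a JSON string, so the client has to decode those
    # fields a second time after parsing the response.
    payload = {"Letter Day": letterDay, "CurrentYear": currentYear, "weekly": weekly, "matrix": matrix}
    return jsonify(payload)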


@app.route('/getSchedule', methods=['POST'])
def getSchedule():
    username = request.headers.get('username')
    ldappassword = request.headers.get('ldappassword')
    pw = request.headers.get('pw')
    dbpw = request.headers.get('dbpw')
    schedFormat = request.headers.get('format')

    url = 'https://pschool.princetonk12.org/public/home.html'
    rses = requests.Session()
    lp = rses.get(url)
    fd = Session._Session__form_data(lp.text, 'LoginForm', {
        'account': username,
        'pw': pw,
        'ldappassword': ldappassword,
        'dbpw': dbpw,
    }, form_url=url)
    rses.post(fd.post_url, data=fd.params)

    if schedFormat == 'currentYear':
        text = rses.get('https://pschool.princetonk12.org/guardian/appstudentsched.html').content
    elif schedFormat == 'weekly':
        text = rses.get('https://pschool.princetonk12.org/guardian/appstudentbellsched.html').content
    elif schedFormat == 'matrix':
        text = rses.get('https://pschool.princetonk12.org/guardian/appstudentmatrixsched.html').content
    else:
        return 'Invalid schedule type, try currentYear, weekly, or matrix', 400
    # Log only the requested format; passwords should never end up in server logs.
    print('format', schedFormat)
    soup = BeautifulSoup(text,'lxml')
    table = soup.find_all('table')[0]
    df = pd.read_html(str(table))

    return jsonify(df[0].to_json(orient='records'))
    #
    # soup = BeautifulSoup(text,'lxml')
    # table = soup.find_all('table')[0]
    # df = pd.read_html(str(table))
    # print(tabulate(df[0], headers='keys', tablefmt='psql'))
    # return tabulate(df[0], headers='keys', tablefmt='psql')

@app.route('/getLetterDay', methods=['POST'])
def getLetterDay():
    url = 'https://pschool.princetonk12.org/public/home.html'
    rses = requests.Session()
    lp = rses.get(url)
    username = request.headers.get('username')
    ldappassword = request.headers.get('ldappassword')
    pw = request.headers.get('pw')
    dbpw = request.headers.get('dbpw')

    fd = Session._Session__form_data(lp.text, 'LoginForm', {
        'account': username,
        'pw': pw,
        'ldappassword': ldappassword,
        'dbpw': dbpw,
    }, form_url=url)
    rses.post(fd.post_url, data=fd.params)
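    # The letter day appears in parentheses, e.g. "(C)", in the second item of the tools list.
    text = rses.get('https://pschool.princetonk12.org/guardian/home.html').text
    soup = BeautifulSoup(text, 'html.parser')
    tools = soup.find('ul', attrs={'id': 'tools'})
    if tools is None:
        # No tools list on the page, most likely because the login failed.
        abort(401)
    date = tools.find_all('li')[1]
    match = re.search(r'\(([A-G])\)', date.text)
    if match:
        return match.group(1)
    else:
        return 'No school today'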
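
The letter-day lookup in `getInfo` and `getLetterDay` above boils down to a single regular expression: find one capital letter from A to G wrapped in parentheses. Here is a small standalone sketch of just that step. The helper name `extract_letter_day` and the sample date strings are made up for illustration; the real text comes from the second `<li>` of the `tools` list on the PowerSchool home page, and its exact wording is an assumption here.

```python
import re

# One capital letter A-G inside parentheses, e.g. "(C)".
LETTER_DAY_RE = re.compile(r"\(([A-G])\)")


def extract_letter_day(date_text):
    """Return the letter day found in date_text, or the fallback message."""
    match = LETTER_DAY_RE.search(date_text)
    if match:
        return match.group(1)
    return "No school today"


# Hypothetical header strings; the real page wording may differ.
print(extract_letter_day("Monday, September 10 (C)"))  # C
print(extract_letter_day("Saturday, September 15"))    # No school today
print(extract_letter_day("Half day (H)"))              # No school today (H is outside A-G)
```

Because the pattern only accepts A through G, any other parenthesized text on the page falls through to the fallback instead of being reported as a letter day.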




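# Session class from the custom library mentioned above (imported as fbbot.infra);
# only its form-handling helper is used by the routes.
class Session: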
    FormInfo = namedtuple('FormInfo', ['params', 'post_url'])

    def __init__(self, settings):
        self.settings = settings
        self.req = requests.Session()
        self.req.headers.update({
            'User-Agent': self.settings.user_agent,
        })

    @staticmethod
    def __form_data(text, formid, params, soup=None, form_url=None):
        if not isinstance(params, dict):
            raise TypeError('Params must be a dict')
        if soup is None:
            soup = BeautifulSoup(text, 'html.parser')
        form = soup.find('form', attrs={'id': formid})
        action = form.attrs.get('action')
        if not urlsplit(action).netloc:
            if form_url is None or not urlsplit(form_url).netloc:
                raise ValueError('kwarg form_url must be specified if form '
                                 'action lacks a host')
            action = urljoin(form_url, action)
        inputs = form.find_all('input') + form.find_all('textarea')
        for i in inputs:
            try:
                name = i.attrs['name']
                type_ = i.attrs['type']
                value = params.get(name)
                if type_ == 'submit':
                    continue
                elif type_ == 'hidden':
                    value = i.attrs['value'] if value is None else value
                elif value is None:
                    raise ValueError('kwarg params dictionary is missing a '
                                     'value for a non-hidden field')
            except KeyError:
                pass
            else:
                params[name] = value
        return Session.FormInfo(params=params, post_url=action)

    def __complete_form(self, form_url, form_id, params, get_params=None):
        page = self.req.get(form_url, params=get_params)
        fd = Session.__form_data(page.text, form_id, params, form_url=form_url)
        self.req.post(fd.post_url, data=fd.params)

    def login(self):
        self.__complete_form(
            'https://mbasic.facebook.com/login.php',
            'login_form',
            {
                'email': self.settings.username,
                'pass': self.settings.password,
            },
        )

    def message(self, user_id, body):
        self.__complete_form(
            'https://mbasic.facebook.com/messages/compose/',
            'composer_form',
            {'body': body},
            get_params={'ids': user_id},
        )
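
The core trick in `__form_data` is worth isolating: read the login form out of the page, keep whatever hidden fields the server put there, overlay the caller’s values, and resolve a relative `action` against the page URL. Below is a rough standalone sketch of that idea. To stay dependency-free it swaps BeautifulSoup for the standard library’s `html.parser`, so it is not the library’s implementation; the `FormFields` class, the `form_data` function, and the sample HTML (including the `token` field) are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormFields(HTMLParser):
    """Collect the action URL and input fields of the form with a given id."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.action = None
        self.fields = {}  # input name -> (type, value)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
            self.action = attrs.get("action", "")
        elif tag == "input" and self.in_form and "name" in attrs:
            self.fields[attrs["name"]] = (attrs.get("type", "text"), attrs.get("value"))

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def form_data(html, form_id, params, form_url):
    """Return (post_url, data): the caller's params plus the form's hidden defaults."""
    parser = FormFields(form_id)
    parser.feed(html)
    if parser.action is None:
        raise ValueError(f"no form with id {form_id!r}")
    data = dict(params)
    for name, (type_, value) in parser.fields.items():
        if type_ == "submit" or name in data:
            continue
        if type_ == "hidden":
            # Hidden fields (tokens and the like) are echoed back unchanged.
            data[name] = value
        else:
            raise ValueError(f"missing value for visible field {name!r}")
    # A relative action such as "/guardian/home.html" is resolved against the page URL.
    return urljoin(form_url, parser.action), data


# Made-up login page, only for demonstration.
PAGE = """
<form id="LoginForm" action="/guardian/home.html" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="account">
  <input type="password" name="pw">
  <input type="submit" name="go" value="Sign In">
</form>
"""

post_url, data = form_data(PAGE, "LoginForm", {"account": "student1", "pw": "secret"},
                           form_url="https://example.com/public/home.html")
print(post_url)  # https://example.com/guardian/home.html
print(data)      # {'account': 'student1', 'pw': 'secret', 'token': 'abc123'}
```

Echoing the hidden fields back is what lets a scripted login look like a browser submission, and raising on a missing visible field catches a forgotten credential before any request is made.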