Hi there! My first post here.
I’m also facing the same challenges: session and dialog handling.
In the project I’m developing (an HR bot), a session is a well-defined interaction with the chatbot:
- Starts with a greeting (‘hi!’)
- Message exchange (dialog)
- Ends with a farewell (‘bye’)
So in this case the session start and end are well defined.
The issue I’m having is in the “dialog” part. Say the chatbot asks you:
“What did you accomplish last week?”
Your answer can be just one message back: “Nothing” (lol) or can be a sequence of messages:
“Well I worked in project X”
“And I helped with project Y”
“And I was sick 2 days”
“…”
The challenge is: how does the bot know when you’re done with the “complete” answer? In normal human interaction, the tone of voice, the body language, the pauses, etc. somehow “mark” the end of an answer (and even with all those cues it’s not 100% accurate). But for a bot that only sees messages, it’s kind of hard.
The only mechanism I’ve thought of so far is to give the human a specific timeframe to answer. If X seconds have passed since the last message and the answer is identified as an “open-ended statement” (through NLP?), the bot would ask something like “aah, is that it?” (or something along those lines…); otherwise it would go on with the dialog.
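To make the timeout idea concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the `AnswerCollector` class, the `looks_open_ended` heuristic (a crude stand-in for whatever NLP would really be used), and the timeout value are assumptions, not part of any bot framework.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def looks_open_ended(text: str) -> bool:
    """Hypothetical placeholder for a real NLP classifier.

    Treats a message as open-ended if it trails off or starts
    like a continuation ("and ...", "well ...").
    """
    cleaned = text.strip().lower()
    return cleaned.endswith(("...", "…", ",")) or cleaned.startswith(("and ", "well "))


@dataclass
class AnswerCollector:
    """Collects the messages that make up one answer to one bot question."""

    timeout_seconds: float  # the "X seconds" of silence to wait for
    messages: List[str] = field(default_factory=list)
    last_message_at: Optional[float] = None

    def add_message(self, text: str, now: float) -> None:
        # Every new message restarts the silence timer.
        self.messages.append(text)
        self.last_message_at = now

    def next_action(self, now: float) -> str:
        """Return "wait", "ask_if_done", or "continue"."""
        if self.last_message_at is None:
            return "wait"  # the user has not answered yet
        if now - self.last_message_at < self.timeout_seconds:
            return "wait"  # still inside the timeframe
        if looks_open_ended(self.messages[-1]):
            return "ask_if_done"  # e.g. "aah, is that it?"
        return "continue"  # answer looks complete, move on with the dialog


collector = AnswerCollector(timeout_seconds=10)
collector.add_message("Well I worked in project X", now=0.0)
print(collector.next_action(now=5.0))
print(collector.next_action(now=10.0))
```

Timestamps are passed in explicitly (rather than read from the clock) so the logic can be driven by whatever scheduler or event loop the bot already uses. It also makes the behaviour easy to test.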
But as Joseph said, what defines a session?
Thanks