[[breakout]]

Differences

This shows you the differences between two versions of the page.

breakout [2018/12/27 09:40]
beckmanf [Running long jobs]
breakout [2022/03/26 17:38] (current)
beckmanf [Deskproto] mill link
Line 9: Line 9:
   * 2 x 480 GB SATA3 SSD Intel DC S3500   * 2 x 480 GB SATA3 SSD Intel DC S3500
   * 1 x 400 GB PCIe NVME SSD P3500   * 1 x 400 GB PCIe NVME SSD P3500
 +  * 1 x 12 TB Western Digital DC HC520 SATA3 (new 12/2021)
   * Intel X540-T2 10GBase-T Ethernet network port   * Intel X540-T2 10GBase-T Ethernet network port
   * 4 x NVIDIA GeForce GTX 1080 with GP104 Pascal, 2560 cores, 8 GB RAM   * 4 x NVIDIA GeForce GTX 1080 with GP104 Pascal, 2560 cores, 8 GB RAM
-  * Debian Linux Jessie, NVIDIA CUDA, Torch +  * Debian Buster
-  * NVIDIA driver 410.78 +  * NVIDIA driver 470.82.01
-  * Kernel 3.16.51-3 +  * Kernel 4.19.0-18
-  * CUDA 10 +  * CUDA 11.4
 +  * TensorFlow, Torch
 +  * Docker 20.10.11, nvidia-docker2 2.8.0-1
  
 ===== Usage Notes ===== ===== Usage Notes =====
Line 21: Line 24:
  
 <code> <code>
-MacBook: ssh -p 2222 <rzaccount>@hs-augsburg.de +ssh -p 2222 <rzaccount>@breakout.hs-augsburg.de
 </code> </code>
  
Line 32: Line 35:
  
 <code> <code>
-MacBook: ssh -Y -p 2222 <rzaccount>@hs-augsburg.de +MacBook: ssh -Y -p 2222 <rzaccount>@breakout.hs-augsburg.de
 </code> </code>
  
Line 50: Line 53:
  
 <code> <code>
-MacBook: ssh -p 2222 fritz@hs-augsburg.de +MacBook: ssh -p 2222 fritz@breakout.hs-augsburg.de
 </code> </code>
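
To avoid typing the port and host name every time, the connection details from the ssh commands above could be kept in an ssh client configuration entry on your own machine. The following is only a sketch: the alias "breakout" and the helper name print_breakout_ssh_config are made up for illustration and are not part of the original instructions; "fritz" is the example account used above.

```shell
#!/bin/sh
# Sketch: print an ssh client config entry for the breakout server.
# The alias "breakout" and this helper function are illustrative assumptions.
print_breakout_ssh_config() {
    user="$1"   # your RZ account name
    cat <<EOF
Host breakout
    HostName breakout.hs-augsburg.de
    Port 2222
    User $user
EOF
}

# Usage (append the entry to your ssh config, then connect via the alias):
#   print_breakout_ssh_config fritz >> ~/.ssh/config
#   ssh breakout
```

With such an entry, "ssh -Y breakout" would still enable X11 forwarding as in the second example above.
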
  
Line 150: Line 153:
 </code> </code>
 
-Now you can start a program. You can leave the tmux session (and the program) running when you type CTRL-b d. This will detach you from the tmux session. Then you can log out from your ssh session and keep everything running on the breakout. The you can log in to breakout via ssh again. You can reattach to tmux with +Now you can start a program. You can leave the tmux session (and the program) running when you type CTRL-b d. This will detach you from the tmux session. Then you can log out from your ssh session and keep everything running on the breakout. Then you can log in to breakout via ssh again. You can reattach to tmux with
 
 <code> <code>
Line 156: Line 159:
 </code> </code>
 
-Then you should see the output from your running program. +You should see the output from your running program.
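
The detach and reattach steps described above can also be scripted. This is only a sketch under assumptions: the session name "training", the helper names and the DRY_RUN switch are illustrative and do not come from the original text; "tmux new-session -d -s" and "tmux attach-session -t" are standard tmux subcommands.

```shell
#!/bin/sh
# Sketch: run a long job in a detached tmux session so it survives an ssh logout.
SESSION="${SESSION:-training}"   # hypothetical session name

run() {
    # With DRY_RUN=1 only print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_job() {
    # -d: create the session detached, -s: session name; the rest is the job command
    run tmux new-session -d -s "$SESSION" "$@"
}

reattach() {
    # Attach to the named session after logging in again
    run tmux attach-session -t "$SESSION"
}
```

For example, "start_job python3 train.py" (train.py is a placeholder) starts the job without attaching, and "reattach" brings its output back after the next login.
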
  
 === kerberos - keep your file system alive === === kerberos - keep your file system alive ===
Line 226: Line 229:
 == Start a job with automatic kerberos ticket renew == == Start a job with automatic kerberos ticket renew ==
  
-You can renew the ticket automatically. When you start a job with "krenew", your existing Kerberos ticket is copied to a new ticket cache location and renewed automatically until the renew time expires or the job is done. To start the example from pytorch imagenet training, this would be done like this: +You can renew the ticket automatically. When you start a job with "krenew", your existing Kerberos ticket is copied to a new ticket cache location and renewed automatically until the renew time expires or the job is done. The ticket cache is copied because the Kerberos cache that you received at login (here: /tmp/krb5cc_12487_ssddef) will be deleted at logout. To start the example from pytorch imagenet training, this would be done like this:
 
 <code> <code>
Line 232: Line 235:
 </code> </code>
 
-If you do this inside a tmux session, then you detach and log out. The job will run for up to seven days. +If you do this inside a tmux session, then you can detach and log out. The job will run for up to seven days. When you log in later, you can check the status of the job's Kerberos ticket again with klist. You have to provide the filename of the job's ticket cache.
 +
 +<code>
 +klist /tmp/krb5cc_12487_ftXjk0
 +</code>
 +
 +In my example the new cache name from krenew was /tmp/krb5cc_12487_ftXjk0.
 +
 +== Login via Public Key Authentication ==
 +
 +When you log in via public key authentication, you do not receive a new Kerberos ticket. If you do not have a valid Kerberos ticket, then "$HOME/.ssh/authorized_keys" cannot be accessed, so the login falls back to the default password login and you receive a new Kerberos ticket. If the login via public key succeeded, "klist" will not show any Kerberos ticket, because the ticket in use belongs to some other login session. However, you can still run "kinit" to receive a new Kerberos ticket. It will be stored in the default Kerberos ticket cache location at "/tmp/krb5cc_<uid>".
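
The check described above (no ticket after a public key login, then "kinit") can be sketched as a few shell helpers. This assumes the MIT Kerberos client tools, where "klist -s" prints nothing and only reports through its exit status, and where a cache name can be given as an argument as in the klist example above. The function names are illustrative.

```shell
#!/bin/sh
# Sketch: make sure a Kerberos ticket exists after a public key login.
# Assumes MIT Kerberos klist/kinit; function names are illustrative.

default_cache() {
    # Default ticket cache location as described above: /tmp/krb5cc_<uid>
    echo "/tmp/krb5cc_$(id -u)"
}

have_ticket() {
    # klist -s is silent; exit status 0 means a readable, unexpired cache.
    # An optional argument names a specific cache, e.g. the one krenew created.
    if [ -n "$1" ]; then
        klist -s "$1" 2>/dev/null
    else
        klist -s 2>/dev/null
    fi
}

ensure_ticket() {
    # Ask for the password via kinit only when no valid ticket is found
    have_ticket || kinit
}
```

Calling "ensure_ticket" right after a public key login would prompt for the password only if no valid ticket is found; "have_ticket /tmp/krb5cc_12487_ftXjk0" would check the job's cache from the example above.
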
 ==== PyTorch ==== ==== PyTorch ====
  
Line 324: Line 337:
 </code> </code>
 
-The training takes about 5 days on the breakout. Refer to "Running long jobs" to see how you can run jobs that long on the breakout. +The training takes about 5 days on the breakout. Refer to [[#Running long jobs]] to see how you can run jobs that long on the breakout.
  
 ==== Civil Engineers - Photoscan ==== ==== Civil Engineers - Photoscan ====
Line 497: Line 509:
  
 Once you have reconnected to the server, you are ready to use python3 with TensorFlow. Once you have reconnected to the server, you are ready to use python3 with TensorFlow.
 +
 +==== Deskproto ====
 +
 +The Deskproto CAM software [[sw-milling|for milling]] is installed and can be started with its GUI. Please start the graphical desktop manager via TurboVNC as described in the [[breakout#virtualgl_und_turbovnc|TurboVNC chapter]] and launch Deskproto from within the desktop manager.
 +
 +=== First Time Setup ===
 +
 +The first run of Deskproto requires two setup steps. First, run Deskproto from your home directory.
 +
 +<code>
 +cd
 +/opt/deskproto/DeskProto71.AppImage
 +</code>
 +
 +Select your language and scaling, and choose any machine; we will overwrite that choice in the next step. Once Deskproto has started, close it. Starting Deskproto for the first time creates two directories
 +
 +<code>
 +~/.local/share/'Delft Spline Systems'/Deskproto
 +~/.config/'Delft Spline Systems'
 +</code>
 +
 +which contain drivers, help pages, etc. We have the [[sw-milling|StepFour XPERT 1000s]] mill in the lab and use [[https://www.hufschmied.net|Hufschmied cutters]]. We have added those cutters and the 1000s to the driver directory /opt/deskproto/Drivers. I have made a setup file which configures our mill and that driver directory. To use it, copy the setup file to your local configuration directory.
 +
 +<code>
 +cd
 +cp /opt/deskproto/DeskProto.conf ~/.config/'Delft Spline Systems'
 +</code>
 +
 +=== Startup of Deskproto ===
 +
 +After you have overwritten the configuration file, you can start Deskproto. Due to a bug, file access to your NFS-mounted home directory is slow: any file dialog takes quite a while (maybe 2 minutes) to display the files in your home directory. As a workaround, you can redefine the HOME variable for Deskproto when you start it.
 +
 +<code>
 +cd
 +HOME=/fast /opt/deskproto/DeskProto71.AppImage
 +</code>
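
The configuration copy and the startup workaround above could be wrapped in two small shell functions. This is a sketch, not an official launcher: the paths come from the steps above, while the function names and the overridable DESKPROTO_DIR and CONF_DIR variables are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: helpers for the Deskproto setup and startup steps described above.
# Function names and the overridable variables are illustrative assumptions.
DESKPROTO_DIR="${DESKPROTO_DIR:-/opt/deskproto}"
CONF_DIR="${CONF_DIR:-$HOME/.config/Delft Spline Systems}"

install_conf() {
    # One-time step: copy the lab setup file over the local configuration.
    # Deskproto must have been run once before, so that CONF_DIR exists.
    [ -d "$CONF_DIR" ] || { echo "run Deskproto once first" >&2; return 1; }
    cp "$DESKPROTO_DIR/DeskProto.conf" "$CONF_DIR/"
}

start_deskproto() {
    # Run in a subshell so the caller's working directory stays unchanged.
    # HOME=/fast applies to this one process only (slow file dialog workaround).
    ( cd && HOME=/fast exec "$DESKPROTO_DIR/DeskProto71.AppImage" )
}
```

Run "install_conf" once after the first start of Deskproto, then use "start_deskproto" for every later start. Keeping the copy as a separate step avoids overwriting your configuration on each launch.
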
 +
 +
 +
  
  