University of California, Irvine    |    Earth System Science Home    |    ESMF Home    |    ESMF SysAdmins
 
 


Table of contents
  1. What queues are available for use on the ESMF?
  2. How can I copy data between /ptmp and the RAID (/data)?
  3. How do I run graphical apps on the ESMF and display them on unix or linux?
  4. How do I run graphical apps on the ESMF and display them on microsoft windows?
  5. Why do I get undefined symbols starting with "mpi_"?
  6. How can I ssh (log in) to the esmf from a web browser?
  7. My startup files got messed up. How can I start over?
  8. How can I set up my job to be run under loadleveler?
  9. Why is my loadleveler job getting stuck in the queue? It seems like it should be running by now.
  10. How can I see what processes are running on the systems that make up the ESMF?
  11. My ssh stopped working. How can I fix it?
  12. gcc won't compile any programs
  13. How do I contend with "endian" issues when moving binary data from, for example, linux/x86 to AIX/pSeries (like the ESMF).
  14. How do I compile programs for use with OpenMP?
  15. How do I get a personal directory under /ptmp?
  16. What are the differences between the various filesystems on the ESMF?
  17. How can I tell who is using which nodes?
  18. How do I get help with questions that aren't answered here?
The frequently asked questions and their answers
  1. What queues are available for use on the ESMF?
    Queue (class) name | Nodes available via this queue | Nodes unavailable to this queue | CPUs available per node | Wall clock limit | Availability
    all_spec | 6 of the 8 ways and the 32 way | The interactive node | 8 on the 8 ways, 32 on the 32 way | 6 days, 0 hours, 16 minutes, 40 seconds | By approval only
    com_rg32 | Just the 32 way | All of the nodes with 8 CPUs | 32 on the 32 way | 24 hours | ESS or by approval
    com_sb32 | Just the 32 way | All of the nodes with 8 CPUs | 32 on the 32 way | 24 hours | Anyone with an account
    com_rg8 | 6 of the 8 ways | The 32 CPU node and the interactive node | 8 on the 8 ways | 24 hours | ESS or by approval
    com_sb8 | 6 of the 8 ways | The 32 CPU node and the interactive node | 8 on the 8 ways | 4 hours | Anyone with an account
    inter_class | All nodes | None | 8 on the 8 ways, 32 on the 32 way | 8 hours, 30 minutes | Anyone with an account
    com_jkm8 | 6 of the 8 ways | The interactive node and the 32 way node | 8 on the 8 ways | 24 hours | By approval only
    com_sb1 | Tied to a single 8 way, node 3 | 7 of the 8 ways and the 32 way | 8 on node 3 | 24 hours | Anyone with an account
    com_rg4 | Tied to a single 8 way, node 4 | 7 of the 8 ways and the 32 way | 4 on node 4 | 20 minutes | ESS or by approval
    com_sb4 | Tied to a single 8 way, node 4 | 7 of the 8 ways and the 32 way | 4 on node 4 | 20 minutes | Anyone with an account
    com_04a | Tied to a single 8 way, node 4 | 7 of the 8 ways and the 32 way | 1 on node 4 | 7 days | By approval only
    com_04b | Tied to a single 8 way, node 4 | 7 of the 8 ways and the 32 way | 1 on node 4 | 7 days | By approval only
    com_rg1 | Tied to a single 8 way, node 3 | 7 of the 8 ways and the 32 way | 8 on node 3 | 7 days | ESS or by approval




    If you need approval to use a queue, please ask your faculty advisor or sponsor to speak with Charlie Zender on your behalf.

    Old queue doc


  2. How can I copy data between /ptmp and the RAID (/data)?
      You can copy data from /ptmp to the RAID using "archive", which is slightly faster than cp or a tar pipeline.

      For example:

      archive -d /ptmp/strombrg/sample-netcdf

      To reverse the direction, that is to copy from the RAID back to /ptmp, you can use:

      archive -d /ptmp/strombrg/sample-netcdf -r
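
      For comparison, the "tar pipeline" mentioned above can be sketched roughly as follows. This is not how archive itself works; copy_tree is a hypothetical helper name, and the destination path in the usage comment is a placeholder.

```shell
#!/bin/sh
# copy_tree SRC DST: copy the contents of directory SRC into DST with a
# tar pipeline, preserving permissions and modification times.
# (copy_tree is a hypothetical helper, not an ESMF command.)
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    # The first tar writes an archive of SRC to stdout; the second tar
    # unpacks that stream inside DST.
    (cd "$src" && tar cf - .) | (cd "$dst" && tar xpf -)
}

# Example (the /data path is a placeholder):
#   copy_tree /ptmp/strombrg/sample-netcdf /data/strombrg/sample-netcdf
```

      Unlike a plain cp, the pipeline keeps permissions and timestamps, which is usually what you want when moving results to longer-term storage.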


  3. How do I run graphical apps on the ESMF and display them on unix or linux?
    1. If you're on a modern but not bleeding-edge version of Linux or Unix, you can simply use "ssh -X username@esmf.ess.uci.edu".
    2. If you're on a very recent version of Unix or Linux (as of January 5, 2005), you may need "ssh -Y username@esmf.ess.uci.edu" instead, which enables trusted X11 forwarding.
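
    A quick way to check whether forwarding took effect after logging in is to look at the DISPLAY variable, which the ssh server normally sets when X11 forwarding succeeds. The helper below is a minimal sketch; x11_status is a hypothetical name, and a set DISPLAY does not by itself guarantee that applications will be able to open windows.

```shell
#!/bin/sh
# x11_status: report whether DISPLAY is set in the current session.
# Returns 0 if it is set, 1 otherwise.
# (x11_status is a hypothetical helper, not an ESMF command.)
x11_status() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY is set to $DISPLAY"
    else
        echo "DISPLAY is not set; log out and reconnect with ssh -X or ssh -Y"
        return 1
    fi
}

# Usage, once logged in to the ESMF:
#   x11_status
```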


  4. How do I run graphical apps on the ESMF and display them on microsoft windows?

    So far, we have tried three methods:

    1. Download Cygwin (you can find it with Google; it's free software) and go through the install program. Scroll to the bottom of the optional packages and select the various parts of XFree86. You'll also want the Windows version of OpenSSH, which is probably available from Cygwin as well. Ask for icons at the end of the install. After XFree86 is installed, do the following each time you start a new session on the ESMF:
      1. Double-click the Cygwin icon.
      2. Type "startx" in the Cygwin window and press Enter.
      3. In the xterm window that pops up, type:
          ssh -X myname@esmf.ess.uci.edu
        This should log you in to the interactive node of the ESMF, and you should be able to run unix/linux graphical apps securely, as long as you don't su.
    2. Another (usually) free possibility is VNC. There are many compatible implementations of VNC, some commercial, some not. We have "TightVNC" installed on the ESMF. VNC is more reliable than Cygwin/XFree86 for some tasks. It's also very fast over low-bandwidth links, especially if you tunnel it over Zebedee. However, VNC is not a secure protocol out of the box, so it's best to use VNC only if you're comfortable with setting up tunneling manually.
    3. A commercial solution is XWin32. My understanding is that XWin32 version 5 is not especially secure, but XWin32 version 6 has a nice, built-in ssh tunneling option. However, it appears that PuTTY (a free ssh client) can be used to forward X11 connections in combination with XWin32 version 5.
    4. A fourth option, one we haven't tried yet, is the XLiveCD.


  5. Why do I get undefined symbols starting with "mpi_"?
      You've most likely built your program with a mix of MPI and non-MPI settings, typically by compiling some .o files with MPI and then linking without it. The error can look like:

      ld: 0711-317 ERROR: Undefined symbol: .mpi_initialized
      ld: 0711-317 ERROR: Undefined symbol: .mpi_abort

      To fix this, remove all of your .o and .a files and recompile from the beginning. With clm, this most likely means you need to rm -r your obj directory and then rerun your run script.
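
      If your build tree has no convenient "clean" target, the cleanup can be sketched as below. clean_objects is a hypothetical helper name; it deletes files, so run the find without the -exec part first to see what it would remove.

```shell
#!/bin/sh
# clean_objects DIR: delete every .o and .a file under DIR so that the
# next build recompiles and relinks everything consistently.
# (clean_objects is a hypothetical helper, not an ESMF command.)
clean_objects() {
    find "$1" -type f \( -name '*.o' -o -name '*.a' \) -exec rm -f {} \;
}

# Preview first:
#   find . -type f \( -name '*.o' -o -name '*.a' \) -print
# Then clean:
#   clean_objects .
```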


  6. How can I ssh (log in) to the esmf from a web browser?
      If you are using a Java-enabled web browser, or have permission to install a Java plugin into your web browser, then you can go to this page and use the hostname esmf.ess.uci.edu with your usual username and password. The entire transmission should be encrypted, and hence safe.

      Note that this will not allow you to run graphical applications.


  7. My startup files got messed up. How can I start over?
      Unpack the default startup files into your home directory:
      cd
      tar xvf /usr/local/startup-files/startup-files.tar
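
      Unpacking the tarball overwrites any startup files of the same name. If you would like to keep your current ones for reference, a cautious variant might look like the sketch below. restore_startup_files and the .orig suffix are assumptions made for illustration, and the tarball path must be absolute.

```shell
#!/bin/sh
# restore_startup_files TARBALL [DIR]: save each regular file in DIR that
# TARBALL would overwrite as NAME.orig, then unpack TARBALL into DIR.
# DIR defaults to $HOME. TARBALL must be an absolute path.
# (restore_startup_files is a hypothetical helper, not an ESMF command.)
restore_startup_files() {
    tarball=$1
    dir=${2:-$HOME}
    (
        cd "$dir" || exit 1
        # Keep a copy of each existing file the tarball is about to replace.
        tar tf "$tarball" | while read -r name; do
            if [ -f "$name" ]; then
                cp -p "$name" "$name.orig"
            fi
        done
        tar xvf "$tarball"
    )
}

# Usage:
#   restore_startup_files /usr/local/startup-files/startup-files.tar
```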
      
  8. How can I set up my job to be run under loadleveler?
    • Please look at the ~strombrg/quick-test directory. It contains a very simple LoadLeveler job that just shows what hosts it would run on. If you make a directory and copy this script into it, you should then be able to run "llsubmit llsubmit-me", and the job should be enqueued into LoadLeveler.
    • If you want to run this "quick-test", please make a copy of it in your own home directory first. By default, LoadLeveler jobs want write access to the directory from which you submit the job.
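
    The copy-then-submit steps above can be sketched as follows. prepare_job_dir is a hypothetical helper name; it only copies the directory and checks that it is writable, then prints the submit command for you to run yourself.

```shell
#!/bin/sh
# prepare_job_dir SRC DST: copy job directory SRC to a new directory DST,
# confirm DST is writable, and print the command that submits the job.
# (prepare_job_dir is a hypothetical helper, not an ESMF command.)
prepare_job_dir() {
    src=$1
    dst=$2
    if [ -e "$dst" ]; then
        echo "$dst already exists; choose another name" >&2
        return 1
    fi
    cp -R "$src" "$dst" || return 1
    if [ ! -w "$dst" ]; then
        echo "no write access to $dst" >&2
        return 1
    fi
    echo "cd $dst && llsubmit llsubmit-me"
}

# Usage:
#   prepare_job_dir ~strombrg/quick-test "$HOME/quick-test"
```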
  9. Why is my loadleveler job getting stuck in the queue? It seems like it should be running by now.
    • Please run "llq -s <jobname>". This won't always be helpful, but often it will identify why your job is waiting. In fact, it's a good idea to run this command about 5 minutes after you submit the job, in case llq -s can help you spot a permissions problem right away, before you've waited a long time for your job to run.
    • Interpreting the output
      Perhaps the most important thing to check is the "task_geometry" definition. Task IDs on that line start at 0, so the highest number plus one is the number of CPUs you're requesting, and the number of parenthesized groups is the number of nodes you're requesting. The ESMF has 7 nodes with 8 CPUs each (one of which is the interactive node), and one node with 32 CPUs. If your geometry requests more CPUs than are available, your job is probably going to get stuck.
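
      As a rough illustration of reading a task_geometry line, the sketch below counts the parenthesized groups (nodes) and the task IDs listed (CPUs requested). geometry_summary is a hypothetical helper name, and the sample line is invented for the example.

```shell
#!/bin/sh
# geometry_summary 'LINE': print "nodes=N tasks=T" for a task_geometry line.
# N is the number of "(" characters; T is the number of task IDs listed
# inside the braces.
# (geometry_summary is a hypothetical helper, not a LoadLeveler command.)
geometry_summary() {
    line=$1
    # Each "(" opens one node group.
    nodes=$(printf '%s\n' "$line" | tr -cd '(' | wc -c | tr -d ' ')
    # Drop everything before "{", put each number on its own line, count them.
    tasks=$(printf '%s\n' "$line" | sed 's/^[^{]*//' | tr -cs '0-9' '\n' | grep -c '[0-9]')
    echo "nodes=$nodes tasks=$tasks"
}

# Example:
#   geometry_summary '# @ task_geometry = {(0,1,2,3)(4,5,6,7)}'
#   prints: nodes=2 tasks=8
```

      Compare the result with the queue you are submitting to: a request for more nodes or CPUs than that queue can reach will not start.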

  10. How can I see what processes are running on the systems that make up the ESMF?
    • Please run "sudo cluster-ps" to see all running processes, in no particular order. You'll need to enter your own password to run this command.
    • If you want to see the processes sorted in descending order by how much virtual memory they are using, please run "sudo cluster-ps | sort -nr +5 | less -sc". This sorts the output of cluster-ps numerically on its 6th field, in reverse order.
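
    The "+5" form is the older sort key syntax, which some newer versions of sort no longer accept. A sketch of the same idea using the -k option is shown below. by_field6_desc is a hypothetical helper name; note that -k 6,6 compares only the 6th field, whereas +5 compares from the 6th field to the end of the line.

```shell
#!/bin/sh
# by_field6_desc: read whitespace-separated lines on stdin and print them
# sorted numerically on the 6th field, largest value first.
# (by_field6_desc is a hypothetical helper, not an ESMF command.)
by_field6_desc() {
    sort -k 6,6nr
}

# Usage:
#   sudo cluster-ps | by_field6_desc | less -sc
```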

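  11. My ssh stopped working. How can I fix it?
    • For Unix or Linux ssh client
      • Please run: "cp /dev/null ~/.ssh/known_hosts". This clears every saved host key, so ssh will ask you to confirm each host again the next time you connect to it.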
    • For a Windows ssh client
      • Please go to this URL and follow the directions there for your ssh client(s) of choice.
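
    For the Unix or Linux case, a slightly more cautious version of the same step keeps a copy of the old file before emptying it. This is a minimal sketch; reset_known_hosts and the .bak suffix are assumptions made for illustration.

```shell
#!/bin/sh
# reset_known_hosts [FILE]: save a copy of FILE as FILE.bak, then empty FILE.
# FILE defaults to ~/.ssh/known_hosts.
# (reset_known_hosts is a hypothetical helper, not an ESMF command.)
reset_known_hosts() {
    file=${1:-$HOME/.ssh/known_hosts}
    if [ ! -f "$file" ]; then
        return 0                      # nothing to reset
    fi
    cp -p "$file" "$file.bak" || return 1
    : > "$file"                       # same effect as cp /dev/null FILE
}

# Usage:
#   reset_known_hosts
```

    On OpenSSH clients that provide it, "ssh-keygen -R esmf.ess.uci.edu" is a narrower alternative that removes only the entries for that one host.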
  12. gcc won't compile any programs
    • If you see errors like the following from gcc compiles:
        gcc -o prog    prog.c
        Assembler:
        /tmp//ccedk7vc.s: line 9: Only .llong should be used for relocatable 
        expressions.
        
    • ...then please be sure to compile with "gcc -maix64", or run "export CC='gcc -maix64'" and rerun configure.
    • The apparent reason this happens is that OBJECT_MODE is set to 64 by default in most accounts, so that xlc and friends produce 64-bit objects by default. This causes problems for gcc, which does not understand this variable and defaults to producing 32-bit code. When gcc then invokes the system assembler, the assembler (which does honor OBJECT_MODE) expects 64-bit assembler source, and you get the errors shown above.
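
    One way to keep gcc in step with OBJECT_MODE is to derive the flag from the variable instead of hard-coding it. The sketch below assumes that an unset OBJECT_MODE, or any value other than 64, should mean a 32-bit build; gcc_bits_flag is a hypothetical helper name.

```shell
#!/bin/sh
# gcc_bits_flag: print the gcc option that matches the current OBJECT_MODE.
# Assumption: anything other than OBJECT_MODE=64 is treated as 32-bit.
# (gcc_bits_flag is a hypothetical helper, not an ESMF command.)
gcc_bits_flag() {
    case "${OBJECT_MODE:-32}" in
        64) echo "-maix64" ;;
        *)  echo "-maix32" ;;
    esac
}

# Usage on AIX:
#   gcc $(gcc_bits_flag) -o prog prog.c
#   CC="gcc $(gcc_bits_flag)" ./configure
```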
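  13. How do I contend with "endian" issues when moving binary data from, for example, linux/x86 to AIX/pSeries (like the ESMF)?
  14. How do I compile programs for use with OpenMP?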
  15. How do I get a personal directory under /ptmp?
    • Please run /usr/local/bin/mkptmpdir. New accounts should now get this directory created automatically, however.
  16. What are the differences between the various filesystems on the ESMF?
    • /ptmp is faster than /data if you use it on esm04.
    • I haven't compared the relative speeds of /ptmp and /data on the other nodes (1-3, 5-8), but there both filesystems would be accessed via a network, which is frequently, but not always, slower than local storage.
    • /data is much larger than /ptmp, /datashare or /home.
    • /ptmp is intended to be a fast place to temporarily store computation results.
    • /data is intended to be a place to copy data to if you have some computation results you want to keep longer term, but you're responsible for your own backups!
    • /datashare is similar to /ptmp in that both are local to esmf04, but /datashare is intended for data used by a number of people on the ESMF, while /ptmp is more appropriate for data only one ESMF user is likely to use.
    • /datashare might be a good place for this data - I've CC'd Charlie to see what he thinks.
    • /home is backed up.
    • /ptmp, /data and /datashare are not backed up!
  17. How can I tell who is using which nodes?
    • Please type the command "esmfusers"
  18. How do I get help with questions that aren't answered here?
    • Please call x4-0189 or e-mail dcs at uci dot edu

 



Website designed and maintained by ESMF and Network and Academic Computing Services
Address questions and comments to ESMF System Administrators